### Jira Tickets Solution
This notebook presents solutions for the Jira tickets created in `create_jira_tickets.py`.

In [1]:
import os
import sys
import logging
import pandas as pd
from pathlib import Path

# The notebook lives outside the src directory, so add the project root to sys.path.
project_root = Path.home() / "dev" / "data_analyzer"
# sys.path holds strings, so compare the string form (a Path object never matches).
if str(project_root) not in sys.path:
    sys.path.append(str(project_root))

from src.clients.db_client import DatabaseClient
from src.models.schemas import QueryResult

logging.basicConfig(level=logging.INFO)

In [2]:
# db connection
DB_PATH = os.path.expanduser("../data/porsche_analytics.db")
sqlite_connection_string = f"sqlite:///{DB_PATH}"

sqlite_client = DatabaseClient(sqlite_connection_string)
print("Connected to the database!")

Connected to the database!
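
For readers without access to the project's `src` package, the sketch below shows one way a comparable client could be built on the standard-library `sqlite3` module. `SimpleSQLiteClient` and `SimpleQueryResult` are hypothetical names introduced here for illustration. The sketch only mirrors the two behaviours this notebook relies on (an `execute_query` method and a `.data` attribute holding rows), and it is not the actual `DatabaseClient` implementation.

```python
import sqlite3
from dataclasses import dataclass, field


@dataclass
class SimpleQueryResult:
    """Hypothetical stand-in for QueryResult: rows stored as a list of dicts."""
    data: list = field(default_factory=list)


class SimpleSQLiteClient:
    """Hypothetical stand-in for DatabaseClient, built on stdlib sqlite3."""

    def __init__(self, db_path: str):
        # Takes a plain file path (or ":memory:"), not a "sqlite:///" URL.
        self.connection = sqlite3.connect(db_path)
        # sqlite3.Row lets each row be converted to a dict keyed by column name.
        self.connection.row_factory = sqlite3.Row

    def execute_query(self, query: str) -> SimpleQueryResult:
        rows = self.connection.execute(query).fetchall()
        return SimpleQueryResult(data=[dict(row) for row in rows])


# In-memory database used only to show the call pattern.
client = SimpleSQLiteClient(":memory:")
result = client.execute_query("SELECT 1 AS answer")
print(result.data)
```

Because `result.data` is a list of dicts, it can be passed to `pd.DataFrame(result.data)` in the same way as in the cells below.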


### Task 1: Car Models Analysis
- How many unique car models do we have per car category? Sort the results in descending order.

In [6]:
query = """
SELECT
    segment,
    COUNT(DISTINCT model_code) AS models_unq_count
FROM models
GROUP BY segment
ORDER BY models_unq_count DESC
"""
result = sqlite_client.execute_query(query)
result_df = pd.DataFrame(result.data)
result_df

INFO:src.clients.db_client:Query executed successfully. Returned 7 rows.


| | segment | models_unq_count |
|---|---|---|
| 0 | Sports Car | 5 |
| 1 | SUV | 3 |
| 2 | Wagon | 1 |
| 3 | Supercar | 1 |
| 4 | Sedan | 1 |
| 5 | Luxury | 1 |
| 6 | Hypercar | 1 |
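
The same aggregation can also be done in pandas once the rows are loaded. The sketch below uses a few hypothetical sample rows (the real `models` table is not shown in this notebook) and reproduces `COUNT(DISTINCT model_code) ... GROUP BY segment ORDER BY ... DESC` with `groupby` and `nunique`.

```python
import pandas as pd

# Hypothetical sample rows for illustration; not the real contents of `models`.
models = pd.DataFrame(
    {
        "segment": ["Sports Car", "Sports Car", "Sports Car", "Sports Car",
                    "SUV", "SUV", "Sedan"],
        # "A3" appears twice to show that duplicates are counted only once.
        "model_code": ["A1", "A2", "A3", "A3", "B1", "B2", "C1"],
    }
)

models_per_segment = (
    models.groupby("segment")["model_code"]
    .nunique()                       # COUNT(DISTINCT model_code)
    .sort_values(ascending=False)    # ORDER BY ... DESC
    .rename("models_unq_count")
    .reset_index()
)
print(models_per_segment)
```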


### Task 2: Dealership Performance by Region Analysis
- Analyze the average dealership rating and sales capacity by region. Which regions have the highest performing dealerships? Sort the results by average rating in descending order.

In [8]:
query = """
SELECT
    region,
    ROUND(AVG(rating), 2) AS average_rating,
    ROUND(AVG(sales_capacity), 1) AS average_sales_capacity,
    COUNT(*) AS dealership_count
FROM
    dealerships
GROUP BY
    region
ORDER BY
    average_rating DESC;
"""
result = sqlite_client.execute_query(query)
result_df = pd.DataFrame(result.data)
result_df

INFO:src.clients.db_client:Query executed successfully. Returned 4 rows.


| | region | average_rating | average_sales_capacity | dealership_count |
|---|---|---|---|---|
| 0 | Middle East | 4.8 | 40.0 | 1 |
| 1 | North America | 4.65 | 32.5 | 2 |
| 2 | Europe | 4.63 | 24.3 | 4 |
| 3 | Asia Pacific | 4.63 | 21.0 | 3 |
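
Europe and Asia Pacific both show 4.63 because the query appears to sort on the rounded alias `average_rating`, so the two regions tie and their relative order is not guaranteed. The pandas sketch below recomputes the aggregation from the region, rating, and sales capacity values in the dealerships table shown at the end of this notebook, and sorts on the unrounded mean instead. It is a cross-check under that assumption, not a replacement for the SQL.

```python
import pandas as pd

# (region, rating, sales_capacity) copied from the dealerships table in this notebook.
dealerships = pd.DataFrame(
    [
        ("North America", 4.7, 30),
        ("Europe", 4.8, 25),
        ("North America", 4.6, 35),
        ("Asia Pacific", 4.9, 20),
        ("Europe", 4.5, 22),
        ("Middle East", 4.8, 40),
        ("Asia Pacific", 4.6, 18),
        ("Europe", 4.7, 30),
        ("Asia Pacific", 4.4, 25),
        ("Europe", 4.5, 20),
    ],
    columns=["region", "rating", "sales_capacity"],
)

by_region = (
    dealerships.groupby("region")
    .agg(
        average_rating=("rating", "mean"),
        average_sales_capacity=("sales_capacity", "mean"),
        dealership_count=("rating", "size"),
    )
    # Sort on the unrounded mean so near-ties are resolved by the actual values.
    .sort_values("average_rating", ascending=False)
)
print(by_region.round(3))
```

With the unrounded means, Asia Pacific (about 4.633) ranks ahead of Europe (4.625). In SQL, ordering by `AVG(rating)` rather than the rounded alias would give the same result.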


### Task 3: Electric vs Conventional Model Analysis
- Compare electric and conventional models by model count, average base price, average horsepower, and average price per horsepower

In [10]:
query = """
SELECT
    CASE 
        WHEN is_electric = 1 THEN 'Electric' 
        ELSE 'Conventional' 
    END AS model_type,
    COUNT(*) AS model_count,
    ROUND(AVG(base_price), 2) AS average_price_usd,
    ROUND(AVG(horsepower), 1) AS average_horsepower,
    ROUND(AVG(base_price / horsepower), 2) AS price_per_hp_ratio
FROM
    models
GROUP BY
    is_electric
ORDER BY
    model_type;
"""
result = sqlite_client.execute_query(query)
result_df = pd.DataFrame(result.data)
result_df

INFO:src.clients.db_client:Query executed successfully. Returned 2 rows.


| | model_type | model_count | average_price_usd | average_horsepower | price_per_hp_ratio |
|---|---|---|---|---|---|
| 0 | Conventional | 10 | 132860.0 | 396.9 | 298.79 |
| 1 | Electric | 3 | 341800.0 | 626.0 | 439.51 |


### Task 4: Service Cost Analysis by Model and Service Type
- Analyze which colors are most popular for different Porsche models. Identify which models have higher maintenance costs and which service types contribute most to overall service revenue
    - `Query`: Analyze which colors are most popular for different Porsche models -> table
    - `EDA`: Identify which models have higher maintenance costs and which service types contribute most to overall service revenue?

In [14]:
query = """
SELECT 
    m.model_name,
    sr.service_type,
    COUNT(*) AS service_count,
    ROUND(AVG(sr.cost), 2) AS average_cost,
    MIN(sr.cost) AS min_cost,
    MAX(sr.cost) AS max_cost
FROM 
    service_records sr
JOIN 
    sales s ON sr.vin = s.vin
JOIN 
    models m ON s.model_id = m.model_id
GROUP BY 
    m.model_name, sr.service_type
ORDER BY 
    m.model_name, average_cost DESC;
"""
result = sqlite_client.execute_query(query)
result_df = pd.DataFrame(result.data)
result_df

INFO:src.clients.db_client:Query executed successfully. Returned 10 rows.


| | model_name | service_type | service_count | average_cost | min_cost | max_cost |
|---|---|---|---|---|---|---|
| 0 | 718 Boxster | Performance Upgrade | 1 | 3800.0 | 3800.0 | 3800.0 |
| 1 | 911 Carrera | Regular Maintenance | 1 | 850.0 | 850.0 | 850.0 |
| 2 | 911 GT3 | Regular Maintenance | 1 | 1100.0 | 1100.0 | 1100.0 |
| 3 | 911 Turbo S | Regular Maintenance | 1 | 950.0 | 950.0 | 950.0 |
| 4 | Cayenne | Regular Maintenance | 1 | 1450.0 | 1450.0 | 1450.0 |
| 5 | Cayenne Coupe | Regular Maintenance | 1 | 1350.0 | 1350.0 | 1350.0 |
| 6 | Macan | Interior Repair | 1 | 1200.0 | 1200.0 | 1200.0 |
| 7 | Panamera | Electrical System | 1 | 950.0 | 950.0 | 950.0 |
| 8 | Taycan | Brake Service | 1 | 1200.0 | 1200.0 | 1200.0 |
| 9 | Taycan Cross Turismo | Tire Replacement | 1 | 2800.0 | 2800.0 | 2800.0 |


**EDA Part (LLM generated)**

_Analysis of Porsche Service Costs_

Based on the provided service data, here are the key insights regarding maintenance costs and revenue contributions:

**Models with Higher Maintenance Costs**
- 718 Boxster - Highest average cost at $3,800 (Performance Upgrade)
- Taycan Cross Turismo - Second highest at $2,800 (Tire Replacement)
- Cayenne - Third highest at $1,450 (Regular Maintenance)
- Cayenne Coupe - Fourth at $1,350 (Regular Maintenance)

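These models show the highest service costs in the data. Each figure comes from a single service record (`service_count` = 1), so the ranking is indicative rather than statistically robust, and the 718 Boxster figure reflects a Performance Upgrade rather than routine maintenance.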

**Service Types by Revenue Contribution**
- Regular Maintenance - Highest total revenue at $5,700 (5 services)
    - Appears across multiple models (911 series, Cayenne models)
    - Average cost: $1,140
- Performance Upgrade - Second highest at $3,800 (1 service)
    - Highest per-service revenue generator
    - Only recorded for 718 Boxster
- Tire Replacement - Third highest at $2,800 (1 service)
    - Second highest individual service cost
    - Only recorded for Taycan Cross Turismo
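
The revenue figures above can be re-derived from the query result. The sketch below copies the model, service type, count, and average cost columns from that result and totals revenue per service type with pandas. It assumes revenue per row equals `service_count * average_cost`, which holds here because every row has a single service.

```python
import pandas as pd

# (model_name, service_type, service_count, average_cost) copied from the query result.
services = pd.DataFrame(
    [
        ("718 Boxster", "Performance Upgrade", 1, 3800.0),
        ("911 Carrera", "Regular Maintenance", 1, 850.0),
        ("911 GT3", "Regular Maintenance", 1, 1100.0),
        ("911 Turbo S", "Regular Maintenance", 1, 950.0),
        ("Cayenne", "Regular Maintenance", 1, 1450.0),
        ("Cayenne Coupe", "Regular Maintenance", 1, 1350.0),
        ("Macan", "Interior Repair", 1, 1200.0),
        ("Panamera", "Electrical System", 1, 950.0),
        ("Taycan", "Brake Service", 1, 1200.0),
        ("Taycan Cross Turismo", "Tire Replacement", 1, 2800.0),
    ],
    columns=["model_name", "service_type", "service_count", "average_cost"],
)

# Revenue per row, then totals per service type.
services["revenue"] = services["service_count"] * services["average_cost"]
revenue_by_type = (
    services.groupby("service_type")
    .agg(total_revenue=("revenue", "sum"), service_count=("service_count", "sum"))
    .sort_values("total_revenue", ascending=False)
)
revenue_by_type["average_cost"] = (
    revenue_by_type["total_revenue"] / revenue_by_type["service_count"]
)
print(revenue_by_type)
```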

**Key Observations**
- SUV models (Cayenne, Cayenne Coupe) have higher regular maintenance costs ($1,350 to $1,450) than the 911 variants ($850 to $1,100), the only other models with regular maintenance records
- Performance-oriented services generate the highest per-service revenue
- Regular maintenance represents the largest revenue stream due to frequency (5 of the 10 recorded services)
- Electric models (Taycan series) appear only with Brake Service ($1,200) and Tire Replacement ($2,800); no regular maintenance is recorded for them


In [9]:
query = """
SELECT * 
FROM dealerships
"""
result = sqlite_client.execute_query(query)
result_df = pd.DataFrame(result.data)
result_df

INFO:src.clients.db_client:Query executed successfully. Returned 10 rows.


| | dealership_id | name | address | city | country | region | opening_date | service_center | sales_capacity | rating | manager_name |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | Porsche New York City | 711 11th Avenue, New York, NY 10019 | New York | United States | North America | 2000-03-15 | 1 | 30 | 4.7 | John Smith |
| 1 | 2 | Porsche Berlin | Franklinstrasse 23, 10587 Berlin | Berlin | Germany | Europe | 1998-06-20 | 1 | 25 | 4.8 | Hans Mueller |
| 2 | 3 | Porsche Los Angeles | 8425 Wilshire Blvd, Beverly Hills, CA 90211 | Los Angeles | United States | North America | 1999-11-10 | 1 | 35 | 4.6 | David Johnson |
| 3 | 4 | Porsche Tokyo | 2-chōme-6-15 Roppongi, Minato City, Tokyo | Tokyo | Japan | Asia Pacific | 2005-04-30 | 1 | 20 | 4.9 | Takeshi Tanaka |
| 4 | 5 | Porsche London | 27 Berkeley Square, Mayfair, London W1J 6DS | London | United Kingdom | Europe | 1997-09-12 | 1 | 22 | 4.5 | Emma Wilson |
| 5 | 6 | Porsche Dubai | Sheikh Zayed Rd, Dubai | Dubai | United Arab Emirates | Middle East | 2010-01-18 | 1 | 40 | 4.8 | Ahmed Al-Farsi |
| 6 | 7 | Porsche Melbourne | 121 Swan St, Richmond VIC 3121 | Melbourne | Australia | Asia Pacific | 2008-07-22 | 1 | 18 | 4.6 | Sarah Johnson |
| 7 | 8 | Porsche Munich | Olof-Palme-Straße 35, 81829 München | Munich | Germany | Europe | 1992-05-05 | 1 | 30 | 4.7 | Franz Weber |
| 8 | 9 | Porsche Shanghai | 888 Tianshan Road, Shanghai | Shanghai | China | Asia Pacific | 2012-11-28 | 1 | 25 | 4.4 | Li Wei |
| 9 | 10 | Porsche Paris | 73 Avenue des Champs-Élysées, 75008 Paris | Paris | France | Europe | 1994-02-14 | 1 | 20 | 4.5 | Claire Dubois |
