## Home team report

The main goal of this script is to generate a report for management with metrics for home teams.

#### Importing the libraries

In [25]:
# Import the libraries used in this process
import sqlite3
import pandas as pd

#### Connecting to the database and creating the cursor

In [26]:
# Connect to the test_analytics_engineer database
conn = sqlite3.connect('test_analytics_engineer.db')

In [27]:
# Create the cursor
c = conn.cursor()
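
The cursor is handy for exploring the database before writing any query. The following is a minimal sketch, not part of the original notebook: it runs against an in-memory database with a hypothetical `demo_team` table so that it is self-contained, and the helper names `list_tables` and `list_columns` are illustrative. Pointing `sqlite3.connect` at `test_analytics_engineer.db` instead would list the real tables.

```python
import sqlite3

# In-memory stand-in for test_analytics_engineer.db; demo_team is a hypothetical table.
demo_conn = sqlite3.connect(":memory:")
demo_cursor = demo_conn.cursor()
demo_cursor.execute("CREATE TABLE demo_team (team_api_id INTEGER, team_long_name TEXT)")


def list_tables(cursor):
    """Return the names of all tables in the database, sorted alphabetically."""
    cursor.execute("SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")
    return [row[0] for row in cursor.fetchall()]


def list_columns(cursor, table):
    """Return the column names of a table using PRAGMA table_info."""
    # PRAGMA statements do not accept bound parameters, so pass trusted table names only.
    cursor.execute(f"PRAGMA table_info({table})")
    # Each row is (cid, name, type, notnull, dflt_value, pk); index 1 is the column name.
    return [row[1] for row in cursor.fetchall()]


tables = list_tables(demo_cursor)
columns = list_columns(demo_cursor, "demo_team")
print(tables)
print(columns)
```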

In [30]:
# Load the player_attributes_modified table to inspect its columns
df = pd.read_sql('''
            SELECT * FROM player_attributes_modified
''',conn)

df.columns

Index(['player_fifa_api_id', 'attribute_id', 'date', 'overall_rating',
       'potential', 'preferred_foot', 'right', 'attacking_work_rate',
       'defensive_work_rate', 'crossing', 'finishing', 'heading_accuracy',
       'short_passing', 'volleys', 'dribbling', 'curve', 'free_kick_accuracy',
       'long_passing', 'ball_control', 'acceleration', 'sprint_speed',
       'agility', 'reactions', 'balance', 'shot_power', 'jumping', 'stamina',
       'strength', 'long_shots', 'aggression', 'interceptions', 'positioning',
       'vision', 'penalties', 'marking', 'standing_tackle', 'sliding_tackle',
       'gk_diving', 'gk_handling', 'gk_kicking', 'gk_positioning',
       'gk_reflexes'],
      dtype='object')

#### Building the report extraction query

In [24]:
# Build the report: one row per home match, then per-team aggregates, then the ranking
df = pd.read_sql('''
            WITH home_team AS (
                SELECT
                    refined_match.season,
                    refined_league.name AS league_name,
                    refined_team.team_api_id,
                    refined_team.team_long_name AS team_name,
                    refined_match.home_team_goal AS goals,
                    refined_match.home_team_wins AS win,
                    refined_match.draw_match AS draw,
                    CASE WHEN refined_match.away_team_wins THEN TRUE ELSE FALSE END AS lose,
                    CASE
                        WHEN refined_match.home_team_wins THEN 3
                        WHEN refined_match.draw_match THEN 1
                        ELSE 0 END AS points
                FROM
                    refined_match
                INNER JOIN
                    refined_team
                ON refined_team.team_api_id = refined_match.home_team_api_id
                INNER JOIN
                    refined_league
                ON refined_match.league_id = refined_league.id
            ),
            
            teams_statistics AS (
                SELECT
                    home_team.season,
                    home_team.league_name,
                    home_team.team_name,
                    COUNT(home_team.team_api_id) AS matches_total,
                    SUM(home_team.goals) AS goals_total,
                    AVG(home_team.goals) AS goals_mean,
                    SUM(home_team.win) AS wins,
                    SUM(home_team.draw) AS draws,
                    SUM(home_team.lose) AS loses,
                    SUM(home_team.points) AS points
                FROM
                    home_team
                GROUP BY
                    home_team.season,
                    home_team.league_name,
                    home_team.team_name
            )
            
            SELECT
                *,
                RANK () OVER (PARTITION BY season,league_name ORDER BY points DESC,wins DESC,goals_total DESC,draws DESC) AS position
            FROM 
                teams_statistics
                
''',conn)

df.head(50)

| | season | league_name | team_name | matches_total | goals_total | goals_mean | wins | draws | loses | points | position |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | 2008/2009 | Belgium Jupiler League | Standard de Liège | 17 | 42 | 2.470588 | 15 | 0 | 2 | 45 | 1 |
| 1 | 2008/2009 | Belgium Jupiler League | RSC Anderlecht | 17 | 48 | 2.823529 | 14 | 1 | 2 | 43 | 2 |
| 2 | 2008/2009 | Belgium Jupiler League | Club Brugge KV | 17 | 37 | 2.176471 | 11 | 2 | 4 | 35 | 3 |
| 3 | 2008/2009 | Belgium Jupiler League | SV Zulte-Waregem | 17 | 31 | 1.823529 | 9 | 6 | 2 | 33 | 4 |
| 4 | 2008/2009 | Belgium Jupiler League | KVC Westerlo | 17 | 23 | 1.352941 | 10 | 2 | 5 | 32 | 5 |
| 5 | 2008/2009 | Belgium Jupiler League | Beerschot AC | 17 | 32 | 1.882353 | 9 | 3 | 5 | 30 | 6 |
| 6 | 2008/2009 | Belgium Jupiler League | KAA Gent | 17 | 31 | 1.823529 | 9 | 3 | 5 | 30 | 7 |
| 7 | 2008/2009 | Belgium Jupiler League | Royal Excel Mouscron | 17 | 28 | 1.647059 | 9 | 3 | 5 | 30 | 8 |
| 8 | 2008/2009 | Belgium Jupiler League | Sporting Charleroi | 17 | 26 | 1.529412 | 9 | 3 | 5 | 30 | 9 |
| 9 | 2008/2009 | Belgium Jupiler League | KSV Cercle Brugge | 17 | 29 | 1.705882 | 9 | 2 | 6 | 29 | 10 |
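
The aggregated columns can be sanity-checked against each other. The sketch below is an addition, not part of the original notebook. It assumes that every match sets exactly one of the win, draw, or loss flags, so that `matches_total` equals `wins + draws + loses`, and it uses the scoring from the query (3 points per win, 1 per draw). The helper name `find_inconsistent_rows` is hypothetical, and the sample rows are copied from the output above.

```python
import pandas as pd

# Three rows copied from the report output above.
sample = pd.DataFrame(
    {
        "team_name": ["Standard de Liège", "RSC Anderlecht", "SV Zulte-Waregem"],
        "matches_total": [17, 17, 17],
        "wins": [15, 14, 9],
        "draws": [0, 1, 6],
        "loses": [2, 2, 2],
        "points": [45, 43, 33],
    }
)


def find_inconsistent_rows(report):
    """Return the rows whose aggregated counts do not add up."""
    # Points follow the scoring in the query: 3 per win, 1 per draw.
    points_ok = report["points"] == 3 * report["wins"] + report["draws"]
    # Assumption: each match is exactly one of win, draw, or loss.
    matches_ok = report["matches_total"] == (
        report["wins"] + report["draws"] + report["loses"]
    )
    return report[~(points_ok & matches_ok)]


bad_rows = find_inconsistent_rows(sample)
print(len(bad_rows))
```

Running the same helper on the full `df` would flag any team whose totals disagree, which is a cheap check before the report reaches management.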


**About the metrics**

The metrics are computed per team, season, and league, not per team alone. This lets the manager break the data down along those dimensions to analyze team performance.

The `position` column ranks teams by their performance as the home side within each season and league. It orders teams by the following criteria, in this order:
- Most points
- Most wins
- Most goals scored
- Most draws
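
The criteria above can be tried out in isolation. The following sketch is an addition that assumes an in-memory SQLite database with a reduced `teams_statistics` table, and it needs SQLite 3.25 or later for window functions. Three rows are copied from the report output, and `Hypothetical FC` is an invented team that ties with KAA Gent on every criterion, to show that `RANK()` gives tied teams the same position and then skips the next one.

```python
import sqlite3

demo_conn = sqlite3.connect(":memory:")
demo_conn.execute(
    "CREATE TABLE teams_statistics (season TEXT, league_name TEXT, team_name TEXT, "
    "points INTEGER, wins INTEGER, goals_total INTEGER, draws INTEGER)"
)
rows = [
    # Copied from the report output.
    ("2008/2009", "Belgium Jupiler League", "Beerschot AC", 30, 9, 32, 3),
    ("2008/2009", "Belgium Jupiler League", "KAA Gent", 30, 9, 31, 3),
    # Hypothetical team that ties with KAA Gent on every criterion.
    ("2008/2009", "Belgium Jupiler League", "Hypothetical FC", 30, 9, 31, 3),
    ("2008/2009", "Belgium Jupiler League", "KSV Cercle Brugge", 29, 9, 29, 2),
]
demo_conn.executemany("INSERT INTO teams_statistics VALUES (?, ?, ?, ?, ?, ?, ?)", rows)

# Same window definition as the report query, applied to the reduced table.
ranking = demo_conn.execute(
    """
    SELECT
        team_name,
        RANK() OVER (
            PARTITION BY season, league_name
            ORDER BY points DESC, wins DESC, goals_total DESC, draws DESC
        ) AS position
    FROM teams_statistics
    ORDER BY position, team_name
    """
).fetchall()
print(ranking)
```

If consecutive positions without gaps are preferred, `DENSE_RANK()` is the usual alternative, and `ROW_NUMBER()` forces a unique position per team.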

**About automation**

How to automate this depends on the tools available. One option is to have an ETL/ELT tool such as Dataform or dbt run this query and materialize the result as a table named `home_team_report` (or similar). A simple query on top of that table could then feed automated views in a data visualization tool such as Metabase or Power BI.
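
As a rough illustration of the materialization step, the sketch below rebuilds a `home_team_report` table with plain `sqlite3`. It is a stand-in for what an ETL/ELT tool would do, not a dbt or Dataform model. The function name `refresh_home_team_report`, the simplified query, and the column types of the in-memory `refined_match` table are assumptions; only the column names come from the report query.

```python
import sqlite3


def refresh_home_team_report(conn):
    """Rebuild home_team_report from scratch using a simplified report query."""
    with conn:  # commits on success, rolls back on error
        conn.execute("DROP TABLE IF EXISTS home_team_report")
        conn.execute(
            """
            CREATE TABLE home_team_report AS
            SELECT
                season,
                home_team_api_id AS team_api_id,
                COUNT(*) AS matches_total,
                SUM(home_team_goal) AS goals_total,
                SUM(home_team_wins) AS wins,
                SUM(draw_match) AS draws
            FROM refined_match
            GROUP BY season, home_team_api_id
            """
        )


# In-memory stand-in with a reduced, hypothetical refined_match table.
demo_conn = sqlite3.connect(":memory:")
demo_conn.execute(
    "CREATE TABLE refined_match (season TEXT, home_team_api_id INTEGER, "
    "home_team_goal INTEGER, home_team_wins INTEGER, draw_match INTEGER)"
)
demo_conn.executemany(
    "INSERT INTO refined_match VALUES (?, ?, ?, ?, ?)",
    [("2008/2009", 1, 2, 1, 0), ("2008/2009", 1, 1, 0, 1), ("2008/2009", 2, 0, 0, 0)],
)

refresh_home_team_report(demo_conn)
refresh_home_team_report(demo_conn)  # safe to run again: the table is rebuilt each time

report = demo_conn.execute(
    "SELECT * FROM home_team_report ORDER BY team_api_id"
).fetchall()
print(report)
```

Dropping and recreating the table keeps each run idempotent, so a scheduler can call the function repeatedly, and the dashboard tool only needs a `SELECT` on `home_team_report`.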