# Pronosoft Football Data - SQL | Database & Queries

---

This Jupyter notebook:
- Runs build_database.py
    - Creates the 'Football_Data' database with SQLite
    - Creates the 'Football_Bets' table
    - Populates the table from the pronosoft_data CSV
- Explores the database
    - Runs a few exploratory queries with Python (_a dedicated SQL client such as DataGrip is recommended_)

## SQLite VS PostgreSQL

PostgreSQL was considered, but for a solo project SQLite is simpler to set up and run: it needs no server, and the whole database lives in a single file.

More information: https://tableplus.com/blog/2018/08/sqlite-vs-postgresql-which-database-to-use-and-why.html

---

---

# Paths

In [1]:
# Raw strings keep the Windows backslashes literal and avoid invalid-escape warnings
path_pronosoft_data = r'..\Data\pronosoft_data.csv'
path_pronosoft_data_extended = r'..\Data\pronosoft_data_extended.csv'

name_database = 'Football_Data.sqlite'
path_database = '..\\SQL\\' + name_database

---

# Libraries

In [1]:
import pandas as pd
import sqlite3

---

# Functions

Query to DataFrame

In [3]:
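def query_to_df(database, query):
    """Run a SQL query against a SQLite database file and return the result as a DataFrame."""
    conn = sqlite3.connect(database)
    try:
        c = conn.cursor()
        c.execute(query)

        # Column names come from the cursor description of the executed query
        fields = [description[0] for description in c.description]
        query_df = pd.DataFrame(c.fetchall(), columns=fields)
    finally:
        # Close the connection even if the query fails
        conn.close()

    return query_df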
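
For comparison, pandas can build the DataFrame directly with `pd.read_sql_query`. The sketch below is not the helper used in this notebook: `query_to_df_pandas` is a hypothetical name, and the demo runs against a throwaway database file holding a two-row `Football_Bets` table, with values borrowed from the sample output further down.

```python
import os
import sqlite3
import tempfile
from contextlib import closing

import pandas as pd


def query_to_df_pandas(database, query):
    """Hypothetical variant of query_to_df that lets pandas build the DataFrame."""
    # closing() guarantees conn.close() is called, even if the query raises
    with closing(sqlite3.connect(database)) as conn:
        return pd.read_sql_query(query, conn)


# Demo on a throwaway database file (not the project's Football_Data.sqlite)
demo_db = os.path.join(tempfile.mkdtemp(), 'demo.sqlite')
with closing(sqlite3.connect(demo_db)) as conn:
    conn.execute('CREATE TABLE Football_Bets (team_1_name TEXT, team_1_bet_odds REAL)')
    conn.executemany(
        'INSERT INTO Football_Bets VALUES (?, ?)',
        [('Troyes', 2.75), ('Celta Vigo', 2.15)],
    )
    conn.commit()

demo_df = query_to_df_pandas(demo_db, 'SELECT * FROM Football_Bets')
print(demo_df)
```

Both versions return the same kind of object; the cursor-based helper above simply makes the column-name extraction explicit.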

---

---

# Create Database & Table

In [4]:
!python build_database.py
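
The script itself is not shown in this notebook. As a rough idea of what such a step can look like, here is a minimal sketch, assuming the CSV is loaded with pandas and written to SQLite with `DataFrame.to_sql`. The function name `build_database`, its parameters, and the demo files are hypothetical; the real build_database.py may work differently.

```python
import os
import sqlite3
import tempfile
from contextlib import closing

import pandas as pd


def build_database(csv_path, database_path, table_name='Football_Bets'):
    """Hypothetical sketch: load a CSV into a SQLite table, replacing any existing one."""
    data = pd.read_csv(csv_path)
    # sqlite3.connect creates the database file if it does not exist yet
    with closing(sqlite3.connect(database_path)) as conn:
        # to_sql creates the table from the DataFrame columns and inserts the rows
        data.to_sql(table_name, conn, if_exists='replace', index=False)
        conn.commit()
    return len(data)


# Demo with a tiny CSV in a temporary folder (not the project's data files)
work_dir = tempfile.mkdtemp()
csv_demo = os.path.join(work_dir, 'demo.csv')
db_demo = os.path.join(work_dir, 'demo.sqlite')
pd.DataFrame(
    {'team_1_name': ['Troyes', 'Aves'], 'team_1_score': [1, 3]}
).to_csv(csv_demo, index=False)

rows_loaded = build_database(csv_demo, db_demo)

# Check that the table now holds the rows from the CSV
with closing(sqlite3.connect(db_demo)) as conn:
    rows_in_table = conn.execute('SELECT COUNT(*) FROM Football_Bets').fetchone()[0]
print(rows_loaded, rows_in_table)
```

With `if_exists='replace'` the table is rebuilt on every run, so rerunning the cell does not duplicate rows.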

---

---

# Explore Database

Connect to database

In [4]:
# !pip install ipython-sql

%load_ext sql
%sql sqlite:///../SQL/Football_Data.sqlite

In [6]:
%%sql 

SELECT * FROM Football_Bets
LIMIT 5

 * sqlite:///../SQL/Football_Data.sqlite
Done.


date,league,time,team_1_name,team_2_name,team_1_prob,team_1_bet_odds,nul_prob,nul_bet_odds,team_2_prob,team_2_bet_odds,prediction_team_pronosoft,under_prob,under_bet_odds,over_prob,over_bet_odds,prediction_uo_pronosoft,team_1_score,team_2_score
01-10-2018,France - Ligue 2,20:45,Troyes,Auxerre,0.39,2.75,0.18,3.1,0.43,2.8,,0.68,1.46,0.32,2.0,U,1,0
01-10-2018,Espagne - Liga Espagnole,21:00,Celta Vigo,Getafe,0.33,2.15,0.38,3.2,0.3,3.6,N,0.51,1.53,0.49,1.88,,1,1
01-10-2018,Angleterre - Premier League,21:00,Bournemouth,Crystal Palace,0.39,2.2,0.25,3.4,0.36,3.2,,0.4,1.85,0.6,1.65,O,2,1
01-10-2018,Italie - Serie A,20:30,Sampdoria,Spal,0.41,1.72,0.25,3.5,0.33,4.8,,0.57,1.67,0.43,1.85,,2,1
01-10-2018,Portugal - Primeira Liga,21:15,Aves,Portimonense,0.4,2.6,0.3,3.25,0.3,2.45,,0.69,1.66,0.31,1.73,U,3,0


In [7]:
# `_` is IPython's reference to the most recent cell output, here the %%sql result set
result = _

query_df_1 = result.DataFrame()

---

## Query to DataFrame

Load query results into Python for processing, visualization, etc.

In [8]:
# Raw string keeps the Windows backslashes literal
path_query_ex = r'..\SQL\prob_from_odds.sql'

with open(path_query_ex, mode='r') as file:
    query_ex = file.read()

In [9]:
query_df_2 = query_to_df(database = path_database, query = query_ex)
query_df_2.head()

Unnamed: 0,date,team_1_name,team_2_name,team_1_prob,team_1_prob_from_odds,team_1_bet_odds,nul_prob,nul_prob_from_odds,nul_bet_odds,team_2_prob,team_2_prob_from_odds,team_2_bet_odds,match_outcome,team_1_profit,nul_profit,team_2_profit
0,01-10-2018,Troyes,Auxerre,0.39,0.35,2.75,0.18,0.31,3.1,0.43,0.34,2.8,1,1.75,-1.0,-1.0
1,01-10-2018,Celta Vigo,Getafe,0.33,0.44,2.15,0.38,0.3,3.2,0.3,0.26,3.6,N,-1.0,2.2,-1.0
2,01-10-2018,Bournemouth,Crystal Palace,0.39,0.43,2.2,0.25,0.28,3.4,0.36,0.29,3.2,1,1.2,-1.0,-1.0
3,01-10-2018,Sampdoria,Spal,0.41,0.54,1.72,0.25,0.27,3.5,0.33,0.19,4.8,1,0.72,-1.0,-1.0
4,01-10-2018,Aves,Portimonense,0.4,0.35,2.6,0.3,0.28,3.25,0.3,0.37,2.45,1,1.6,-1.0,-1.0
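
The contents of prob_from_odds.sql are not shown here, so how it derives the `*_prob_from_odds` and `*_profit` columns is an assumption. The values above are consistent with two simple rules: probabilities taken as normalized inverse odds, and profit computed for a 1-unit stake. A minimal Python sketch of those rules, with hypothetical function names, checked against the first row of the output:

```python
def probs_from_odds(odds_1, odds_n, odds_2):
    """Normalized inverse odds: each 1/odds divided by their sum, so the three add up to 1."""
    inverse = [1 / odds_1, 1 / odds_n, 1 / odds_2]
    # For the first row this sum is about 1.04; the excess over 1 is the bookmaker's margin
    total = sum(inverse)
    return [p / total for p in inverse]


def unit_profit(outcome, bet_on, odds):
    """Profit of a 1-unit stake: odds - 1 if the bet wins, -1 otherwise."""
    return odds - 1 if outcome == bet_on else -1.0


# First row of the output above: Troyes vs Auxerre, match_outcome '1'
probs = probs_from_odds(2.75, 3.1, 2.8)
print([round(p, 2) for p in probs])   # [0.35, 0.31, 0.34]
print(unit_profit('1', '1', 2.75))    # 1.75
print(unit_profit('1', 'N', 3.1))     # -1.0
```

The rounded results match `team_1_prob_from_odds`, `nul_prob_from_odds`, `team_2_prob_from_odds`, `team_1_profit` and `nul_profit` for that row.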


---

---