# Insert Results

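This notebook fetches results.csv from MinIO, then fetches races.csv and the other parent files (drivers.csv, constructors.csv, status.csv), and reads each one into a pandas DataFrame before loading it into Postgres.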
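
The download-and-parse step is repeated once per file further down. As a minimal sketch of how it could be factored out, the helper below takes any client exposing `get_object(bucket, name)`, which is how the `Minio` client is used in this notebook. The name `read_csv_object` is hypothetical and not part of the original code.

```python
from io import BytesIO

import pandas as pd


def read_csv_object(client, bucket, object_name, **read_csv_kwargs):
    """Download one object from the store and parse it as CSV.

    `client` is assumed to behave like minio.Minio: get_object() returns a
    response with read(), close() and release_conn().
    """
    response = client.get_object(bucket, object_name)
    try:
        return pd.read_csv(BytesIO(response.read()), **read_csv_kwargs)
    finally:
        # The response wraps an HTTP connection; release it even if parsing fails.
        response.close()
        response.release_conn()
```

With the client from cell 4, a call such as `read_csv_object(minio_client, "track.data-raw", "status.csv")` would replace the three lines used per file.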

#### Logic:

If we examine the schema of the results table in Postgres, it looks as follows:
    result_id INT GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY,
    event_id INT REFERENCES events(event_id),
    driver_id INT REFERENCES drivers(driver_id),
    constructor_id INT REFERENCES constructors(constructor_id),
    number INT,
    grid INT,
    position INT,
    points DECIMAL,
    laps INT,
    time VARCHAR(255),
    fastest_lap_time VARCHAR(255),
    rank INT,
    fastest_lap_speed DECIMAL,
    status_id INT REFERENCES status(status_id)

Several id columns reference parent tables. We are not supposed to load these columns directly from results.csv; their values should come from the parent tables instead. If the parent rows are not present, the ids cannot flow in, and referential integrity is broken.

Hence, to get accurate values for these columns, we need to load the parent tables first so that their ids are generated and can flow in when we load the results table.

We have to load data into the following tables:
1. Events
2. Drivers
3. Constructors
4. Status
5. Results
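
One way to let generated ids "flow in" is to read a parent table back after loading it and map the CSV's source ids to the database ids through a natural key. The sketch below assumes `driver_ref` is unique on both sides; the helper name `build_id_map` and the toy frames are illustrative and do not appear in the notebook.

```python
import pandas as pd


def build_id_map(source_df, db_df, source_key, db_key, source_id, db_id):
    """Return a Series that maps source ids to database-generated ids."""
    joined = source_df[[source_id, source_key]].merge(
        db_df[[db_id, db_key]],
        left_on=source_key,
        right_on=db_key,
        how="left",
        validate="one_to_one",  # fail loudly if the natural key is not unique
    )
    return joined.set_index(source_id)[db_id]


# Toy stand-ins for drivers.csv and for race_data.drivers read back from Postgres.
src_drivers = pd.DataFrame({"driverId": [1, 830], "driverRef": ["hamilton", "max_verstappen"]})
db_drivers = pd.DataFrame({"driver_id": [10, 11], "driver_ref": ["hamilton", "max_verstappen"]})
src_results = pd.DataFrame({"driverId": [830, 1, 830]})

id_map = build_id_map(src_drivers, db_drivers, "driverRef", "driver_ref", "driverId", "driver_id")
src_results["driver_id"] = src_results["driverId"].map(id_map)
```

The same pattern would apply to constructors and status, each with its own natural key.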

Key pointers to talk about:
- The nationality-to-country mapping is missing.
- A lot of records seem to be dirty, but we need to understand the business rules before cleaning them.
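
To illustrate the first pointer: a minimal sketch of such a mapping, assuming drivers.csv stores demonyms such as "British" while the countries table stores names such as "United Kingdom". The dictionary `NATIONALITY_TO_COUNTRY` is hypothetical and deliberately incomplete; the toy frames stand in for the real data.

```python
import pandas as pd

# Hypothetical lookup; a full version would cover every nationality in drivers.csv.
NATIONALITY_TO_COUNTRY = {
    "British": "United Kingdom",
    "German": "Germany",
    "Spanish": "Spain",
}

drivers = pd.DataFrame({
    "driverRef": ["hamilton", "heidfeld", "kovalainen"],
    "nationality": ["British", "German", "Finnish"],
})
countries = pd.DataFrame({
    "country_id": [3, 4, 8],
    "name": ["United Kingdom", "Spain", "Germany"],
})

# Translate the demonym first, then join on the country name.
drivers["country_name"] = drivers["nationality"].map(NATIONALITY_TO_COUNTRY)
merged = drivers.merge(countries, left_on="country_name", right_on="name", how="left")

# Nationalities that still have no country_id need a dictionary entry or a new countries row.
unmapped = merged.loc[merged["country_id"].isna(), "nationality"].unique().tolist()
```

Listing `unmapped` after the join makes the remaining gaps explicit instead of silently loading empty nationality values.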

In [1]:
import pandas as pd
import numpy as np
from io import BytesIO
from minio import Minio
from sqlalchemy import create_engine, text
from datetime import datetime
import re
import logging
import psycopg2
# from psycopg2.extras import execute_batch

In [2]:
# logging.basicConfig(format='%(asctime)s %(levelname)s - %(message)s', level=logging.INFO)
logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s.%(msecs)03d %(levelname)s %(module)s - %(funcName)s: %(message)s',
    datefmt='%Y-%m-%d %H:%M:%S',
)
logger = logging.getLogger(__name__)
# logger=logging.basicConfig(level=logging.INFO, format='%(asctime)s %(levelname)s:%(message)s')
# logger = logging.getLogger()

In [3]:
# data = minio_client.get_object("track.data-raw", "results.csv")
# results_data = data.read().decode('utf-8')
# # data_cleaned = re.sub(r'\\N', '', results_data)
# cleaned_data = BytesIO(data_cleaned.encode('utf-8'))
# columns_to_replace = ['rank']
# df_results = pd.read_csv(cleaned_data)
# df_results = df_results.fillna(0)

In [4]:
try:
    logger.info("Trying MinIO Initilization.")
    # Initialize Minio client
    minio_client = Minio(
        "minio:9000",
        access_key="minioadmin",
        secret_key="minioadmin",
        secure=False
    )
    logger.info("MinIO Initialized .")
except Exception as e:
    logger.error("MinIO client initialisation error: %s", e)

try:
    logger.info("Downloading results.csv from minIO object store")
    results = minio_client.get_object("track.data-raw", "results.csv")
    results = BytesIO(results.read())
    result_csv = pd.read_csv(results)
    logger.info("Results file with %s records downloaded and read into DataFrame 'result_csv'", len(result_csv))

    logger.info("Downloading races.csv from minIO object store")
    # Due to Postgres foreign key constraints, the other CSV files must be loaded into their respective tables before results can be loaded.
    races = minio_client.get_object("track.data-raw", "races.csv")
    races = BytesIO(races.read())
    race_csv = pd.read_csv(races)
    logger.info("Races file with %s records downloaded and read into DataFrame 'race_csv'", len(race_csv))

    logger.info("Downloading drivers.csv from minIO object store")
    drivers = minio_client.get_object("track.data-raw", "drivers.csv")
    drivers = BytesIO(drivers.read())
    drivers_csv = pd.read_csv(drivers)
    logger.info("drivers file with %s records downloaded and read into DataFrame 'drivers_csv'", len(drivers_csv))

    logger.info("Downloading constructors.csv from minIO object store")
    constructors = minio_client.get_object("track.data-raw", "constructors.csv")
    constructors = BytesIO(constructors.read())
    constructors_csv = pd.read_csv(constructors)
    logger.info("constructors file with %s records downloaded and read into DataFrame 'constructors_csv'", len(constructors_csv))

    logger.info("Downloading status.csv from minIO object store")
    status = minio_client.get_object("track.data-raw", "status.csv")
    status = BytesIO(status.read())
    status_csv = pd.read_csv(status)
    logger.info("status file with %s records downloaded and read into DataFrame 'status_csv'", len(status_csv))
    
    
    # # Download the CSV file from the Minio bucket into a pandas DataFrame
    # data = minio_client.get_object("track.data-raw", "results.csv")
    # results_data = data.read().decode('utf-8')
    # data_cleaned = re.sub(r'\\N', '', results_data)
    # cleaned_data = BytesIO(data_cleaned.encode('utf-8'))
    # columns_to_replace = ['rank']
    # df_results = pd.read_csv(cleaned_data)
    # df_results = df_results.fillna(0)
    # logger.info("Races file with %s records downloaded and read into DataFrame 'df_csv'", len(df_result))
    
    # logger.info("Downloading races.csv from minIO object store")
    # # Download the CSV file from the Minio bucket into a pandas DataFrame
    # data = minio_client.get_object("track.data-raw", "races.csv")
    # race_data = data.read().decode('utf-8')
    # data_cleaned = re.sub(r'\\N', '', race_data)
    # cleaned_data = BytesIO(data_cleaned.encode('utf-8'))
    # df_race = pd.read_csv(cleaned_data)
    # df_race = df_race.fillna('')
    
    
except Exception as e:
    logger.error("File download/read error: %s", e)
# pd.set_option('display.max_rows', None)

2023-10-02 08:02:17.375 INFO 1902863203 - <module>: Trying MinIO Initilization.
2023-10-02 08:02:17.378 INFO 1902863203 - <module>: MinIO Initialized .
2023-10-02 08:02:17.380 INFO 1902863203 - <module>: Downloading results.csv from minIO object store
2023-10-02 08:02:17.488 INFO 1902863203 - <module>: Results file with 26160 records downloaded and read into DataFrame 'result_csv'
2023-10-02 08:02:17.489 INFO 1902863203 - <module>: Downloading races.csv from minIO object store
2023-10-02 08:02:17.499 INFO 1902863203 - <module>: Races file with 1101 records downloaded and read into DataFrame 'race_csv'
2023-10-02 08:02:17.499 INFO 1902863203 - <module>: Downloading drivers.csv from minIO object store
2023-10-02 08:02:17.507 INFO 1902863203 - <module>: drivers file with 858 records downloaded and read into DataFrame 'drivers_csv'
2023-10-02 08:02:17.508 INFO 1902863203 - <module>: Downloading constructors.csv from minIO object store
2023-10-02 08:02:17.514 INFO 1902863203 - <module>: con

# Loading the data for the required parent tables first. It is important to load these tables first in order to maintain the referential integrity constraints.
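
Before loading a child table, it can help to check which foreign key values have no parent row. The following is a minimal sketch on toy data; the helper name `find_orphans` is hypothetical and not part of the notebook.

```python
import pandas as pd


def find_orphans(child_df, child_col, parent_df, parent_col):
    """Return the child rows whose foreign key has no match in the parent."""
    has_parent = child_df[child_col].isin(parent_df[parent_col])
    return child_df.loc[~has_parent]


# Toy data: status id 99 does not exist in the parent table.
status = pd.DataFrame({"status_id": [1, 2, 3]})
results = pd.DataFrame({"result_id": [10, 11, 12], "status_id": [1, 99, 3]})

orphans = find_orphans(results, "status_id", status, "status_id")
```

An empty `orphans` frame suggests the insert will not fail on that constraint; a non-empty one lists the rows to investigate first.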

In [5]:
# try:
#     logger.info("Cleansing and normalisation initiated.")
#     # remove special characters and non-alphanumeric characters
#     remove_special_chars = lambda text: re.sub(r'[^a-zA-Z0-9\s]', '', text)
    
#     # data cleansing and normalization for accent characters
    
#     clean_normalize = lambda s: replace('\\N', 99999)  \
#                                      .strip()
    
#     # Creating list of columns to clean
#     columns_to_clean_db = ['circuit_reference', 'name', 'location']
#     columns_to_clean_csv = ['circuitRef', 'name', 'location']
    
#     # Apply the lambda functions to the specified columns in both DataFrames
#     for column in columns_to_clean_db:
#         # fill na values with nulls before cleaning
#         df_db[column] = df_db[column].fillna('').apply(str)
#         #normalise and cleanse
#         df_db[column] = df_db[column].apply(clean_normalize).apply(remove_special_chars)
#     for column in columns_to_clean_csv:
#         # fill na values with nulls before cleaning
#         df_csv[column] = df_csv[column].fillna('').apply(str)
#         #normalise and cleanse
#         df_csv[column] = df_csv[column].apply(clean_normalize).apply(remove_special_chars)
#     logger.info("Cleansing and normalisation completed.")
# except Exception as e:
#     logger.error("Error %s occurred during cleansing and standardisation", e)

In [6]:
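try:
    logger.info("Trying database connection.")
    engine = create_engine('postgresql://admin:admin@pgdb/postgres')
    # create_engine() is lazy, so run a trivial query to make a bad host or password fail here.
    with engine.connect() as connection:
        connection.execute(text("SELECT 1"))
    logger.info("Database connection successful.")
except Exception as e:
    logger.error("Error during database connection: %s", e)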

2023-10-02 08:02:17.544 INFO 3559023953 - <module>: Trying database connection.
2023-10-02 08:02:17.595 INFO 3559023953 - <module>: Database connection successful.


In [7]:
df_db_country = pd.read_sql("SELECT * FROM race_data.countries;", engine)

In [8]:
df_db_country

Unnamed: 0,country_id,name
0,1,Canada
1,2,Austria
2,3,United Kingdom
3,4,Spain
4,5,Belgium
5,6,Italy
6,7,Russia
7,8,Germany
8,9,Portugal
9,10,Bahrain


In [9]:
drivers_csv.dtypes

driverId        int64
driverRef      object
number         object
code           object
forename       object
surname        object
dob            object
nationality    object
url            object
dtype: object

In [10]:
merged_df = pd.merge(drivers_csv, df_db_country, left_on='nationality', right_on='name', how='left')

In [11]:
# '\\N' marks a missing driver number in the CSV; 0 is used as a placeholder.
merged_df['number'] = merged_df['number'].replace(to_replace='\\N', value=0)
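
Replacing `\N` with 0 makes a missing number indistinguishable from a genuine 0. If the business rule allows NULLs in `number`, an alternative is to treat `\N` as missing at read time. The sketch below uses a two-row stand-in for drivers.csv; whether NULL is acceptable in the target column is an assumption.

```python
from io import StringIO

import pandas as pd

# Two-row stand-in for drivers.csv; the second driver has no number.
raw = StringIO("driverRef,number\nhamilton,44\nheidfeld,\\N\n")

# na_values turns the \N marker into NaN while parsing.
drivers = pd.read_csv(raw, na_values=["\\N"])

# The nullable Int64 dtype keeps whole numbers whole and represents gaps as <NA>.
drivers["number"] = drivers["number"].astype("Int64")
```

The same `na_values` argument could be passed to each `pd.read_csv` call in cell 4, which would also remove the need for the later per-column replacements.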

In [12]:
merged_df=merged_df.drop(columns=['driverId','url','nationality','name'])
new_column_names = {
        'driverRef': 'driver_ref'
        ,'country_id': 'nationality'
    }
merged_df.rename(columns=new_column_names, inplace=True)

In [13]:
merged_df

Unnamed: 0,driver_ref,number,code,forename,surname,dob,nationality
0,hamilton,44,HAM,Lewis,Hamilton,1985-01-07,
1,heidfeld,0,HEI,Nick,Heidfeld,1977-05-10,
2,rosberg,6,ROS,Nico,Rosberg,1985-06-27,
3,alonso,14,ALO,Fernando,Alonso,1981-07-29,
4,kovalainen,0,KOV,Heikki,Kovalainen,1981-10-19,
...,...,...,...,...,...,...,...
853,zhou,24,ZHO,Guanyu,Zhou,1999-05-30,
854,de_vries,21,DEV,Nyck,de Vries,1995-02-06,
855,piastri,81,PIA,Oscar,Piastri,2001-04-06,
856,sargeant,2,SAR,Logan,Sargeant,2000-12-31,


In [14]:
conn_string = 'postgresql://admin:admin@pgdb/postgres'
engine = create_engine(conn_string)

table_name = 'drivers'
# to_sql opens and closes its own connection through the engine,
# so no separate SQLAlchemy or psycopg2 connection is needed.
merged_df.to_sql(table_name, engine, if_exists='append', index=False, schema='race_data')

In [15]:
constructors_csv=constructors_csv.drop(columns=['constructorId','url','nationality','constructorRef'])

In [16]:
constructors_csv

Unnamed: 0,name
0,McLaren
1,BMW Sauber
2,Williams
3,Renault
4,Toro Rosso
...,...
206,Manor Marussia
207,Haas F1 Team
208,Racing Point
209,AlphaTauri


In [17]:
conn_string = 'postgresql://admin:admin@pgdb/postgres'
engine = create_engine(conn_string)

table_name = 'constructors'
# to_sql opens and closes its own connection through the engine,
# so no separate SQLAlchemy or psycopg2 connection is needed.
constructors_csv.to_sql(table_name, engine, if_exists='append', index=False, schema='race_data')

In [18]:
status_csv=status_csv.drop(columns=['statusId'])
status_csv

Unnamed: 0,status
0,Finished
1,Disqualified
2,Accident
3,Collision
4,Engine
...,...
134,Damage
135,Debris
136,Illness
137,Undertray


In [19]:
conn_string = 'postgresql://admin:admin@pgdb/postgres'
engine = create_engine(conn_string)

table_name = 'status'
# to_sql opens and closes its own connection through the engine,
# so no separate SQLAlchemy or psycopg2 connection is needed.
status_csv.to_sql(table_name, engine, if_exists='append', index=False, schema='race_data')

# Getting the rest of the attributes for the results table

The remaining columns are now:

    event_id INT REFERENCES events(event_id),
    number INT,
    grid INT,
    position INT,
    points DECIMAL,
    laps INT,
    time VARCHAR(255),
    fastest_lap_time VARCHAR(255),
    rank INT,
    fastest_lap_speed DECIMAL,


Fetch the season data from the database and join the race CSV with it on year to get the season id first.
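
A minimal sketch of that join on toy data, with two optional safeguards from `DataFrame.merge`: `validate` checks that each year maps to at most one season row, and `indicator` flags races that found no season. The frames and their values are illustrative only.

```python
import pandas as pd

# Toy stand-ins for races.csv and race_data.season.
races = pd.DataFrame({"raceId": [1052, 1074, 900], "year": [2021, 2022, 2014], "round": [1, 1, 1]})
seasons = pd.DataFrame({"season_id": [72, 73], "year": [2021, 2022]})

merged = races.merge(
    seasons,
    on="year",
    how="left",
    validate="many_to_one",  # raises MergeError if a year appears twice in seasons
    indicator=True,          # adds a _merge column: 'both' or 'left_only'
)

# Races whose year has no season row would violate the foreign key later on.
missing_season = merged.loc[merged["_merge"] == "left_only", "raceId"].tolist()
```

Since the season table also carries a `championship_id`, the `validate` check would surface the case where one year belongs to several championships.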

In [20]:
race_csv

Unnamed: 0,raceId,year,round,circuitId,name,date,time,url,fp1_date,fp1_time,fp2_date,fp2_time,fp3_date,fp3_time,quali_date,quali_time,sprint_date,sprint_time
0,1,2009,1,1,Australian Grand Prix,2009-03-29,06:00:00,http://en.wikipedia.org/wiki/2009_Australian_G...,\N,\N,\N,\N,\N,\N,\N,\N,\N,\N
1,2,2009,2,2,Malaysian Grand Prix,2009-04-05,09:00:00,http://en.wikipedia.org/wiki/2009_Malaysian_Gr...,\N,\N,\N,\N,\N,\N,\N,\N,\N,\N
2,3,2009,3,17,Chinese Grand Prix,2009-04-19,07:00:00,http://en.wikipedia.org/wiki/2009_Chinese_Gran...,\N,\N,\N,\N,\N,\N,\N,\N,\N,\N
3,4,2009,4,3,Bahrain Grand Prix,2009-04-26,12:00:00,http://en.wikipedia.org/wiki/2009_Bahrain_Gran...,\N,\N,\N,\N,\N,\N,\N,\N,\N,\N
4,5,2009,5,4,Spanish Grand Prix,2009-05-10,12:00:00,http://en.wikipedia.org/wiki/2009_Spanish_Gran...,\N,\N,\N,\N,\N,\N,\N,\N,\N,\N
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1096,1116,2023,18,69,United States Grand Prix,2023-10-22,19:00:00,https://en.wikipedia.org/wiki/2023_United_Stat...,2023-10-20,17:30:00,2023-10-21,18:00:00,\N,\N,2023-10-20,21:00:00,2023-10-21,22:00:00
1097,1117,2023,19,32,Mexico City Grand Prix,2023-10-29,20:00:00,https://en.wikipedia.org/wiki/2023_Mexico_City...,2023-10-27,18:30:00,2023-10-27,22:00:00,2023-10-28,17:30:00,2023-10-28,21:00:00,\N,\N
1098,1118,2023,20,18,São Paulo Grand Prix,2023-11-05,17:00:00,https://en.wikipedia.org/wiki/2023_S%C3%A3o_Pa...,2023-11-03,14:30:00,2023-11-04,14:30:00,\N,\N,2023-11-03,18:00:00,2023-11-04,18:30:00
1099,1119,2023,21,80,Las Vegas Grand Prix,2023-11-19,06:00:00,https://en.wikipedia.org/wiki/2023_Las_Vegas_G...,2023-11-17,04:30:00,2023-11-17,08:00:00,2023-11-18,04:30:00,2023-11-18,08:00:00,\N,\N


In [21]:
df_db_seasons = pd.read_sql("SELECT * FROM race_data.season", engine)

In [22]:
merged_df_race_season = pd.merge(race_csv[(race_csv.year >=2021)], df_db_seasons, on='year', how='left')

In [23]:
merged_df_race_season.dtypes

raceId              int64
year                int64
round               int64
circuitId           int64
name               object
date               object
time               object
url                object
fp1_date           object
fp1_time           object
fp2_date           object
fp2_time           object
fp3_date           object
fp3_time           object
quali_date         object
quali_time         object
sprint_date        object
sprint_time        object
season_id           int64
championship_id     int64
dtype: object

In [24]:
### The circuit id could be carried over, but it is dropped for now: the circuit ids in races.csv are not present in the circuit table, so loading them would break referential integrity.
# .copy() makes events_df_csv an independent DataFrame, so the rename below does not act on a slice.
events_df_csv = merged_df_race_season[['season_id','round', 'name','date']].copy()

In [25]:

new_column_names = {
        'round': 'race_round'
        ,'name': 'official_name'
        ,'circuitId': 'circuit_id'
    
    }
events_df_csv.rename(columns=new_column_names, inplace=True)


In [26]:
conn_string = 'postgresql://admin:admin@pgdb/postgres'
engine = create_engine(conn_string)

table_name = 'events'
# to_sql opens and closes its own connection through the engine,
# so no separate SQLAlchemy or psycopg2 connection is needed.
events_df_csv.to_sql(table_name, engine, if_exists='append', index=False, schema='race_data')

#### Now join result_csv with merged_df_race_season on raceId

In [27]:
merged_df_race_season_result = pd.merge(result_csv, merged_df_race_season, on='raceId', how='inner')

In [28]:
merged_df_race_season_result

Unnamed: 0,resultId,raceId,driverId,constructorId,number,grid,position,positionText,positionOrder,points,...,fp2_date,fp2_time,fp3_date,fp3_time,quali_date,quali_time,sprint_date,sprint_time,season_id,championship_id
0,24966,1052,1,131,44,2,1,1,1,25.0,...,2021-03-26,\N,2021-03-27,\N,2021-03-27,\N,\N,\N,72,1
1,24967,1052,830,9,33,1,2,2,2,18.0,...,2021-03-26,\N,2021-03-27,\N,2021-03-27,\N,\N,\N,72,1
2,24968,1052,822,131,77,3,3,3,3,16.0,...,2021-03-26,\N,2021-03-27,\N,2021-03-27,\N,\N,\N,72,1
3,24969,1052,846,1,4,7,4,4,4,12.0,...,2021-03-26,\N,2021-03-27,\N,2021-03-27,\N,\N,\N,72,1
4,24970,1052,815,9,11,0,5,5,5,10.0,...,2021-03-26,\N,2021-03-27,\N,2021-03-27,\N,\N,\N,72,1
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1195,26161,1114,848,3,23,13,\N,R,16,0.0,...,2023-09-22,06:00:00,2023-09-23,02:30:00,2023-09-23,06:00:00,\N,\N,74,1
1196,26162,1114,858,3,2,0,\N,R,17,0.0,...,2023-09-22,06:00:00,2023-09-23,02:30:00,2023-09-23,06:00:00,\N,\N,74,1
1197,26163,1114,840,117,18,17,\N,R,18,0.0,...,2023-09-22,06:00:00,2023-09-23,02:30:00,2023-09-23,06:00:00,\N,\N,74,1
1198,26164,1114,815,9,11,5,\N,R,19,0.0,...,2023-09-22,06:00:00,2023-09-23,02:30:00,2023-09-23,06:00:00,\N,\N,74,1


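#### Now read the events table back from the database, since it now holds the generated event_id values, and join it with merged_df_race_season_result on season id. A subset of the final merged DataFrame becomes the results DataFrame, which is then inserted into the database. Note that season id alone does not identify a single event: each result is matched to every event of its season, which is why the same result appears with several event_id values in the output below.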
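
The effect of the join key can be seen on a small example. The sketch below compares a join on `season_id` alone with a join on `season_id` plus the round, assuming the pair (`season_id`, `race_round`) identifies one event. The toy frames are illustrative; `race_round` is the column name the events table receives in cell 25.

```python
import pandas as pd

# Toy data: one season with two events, and one result per round.
results = pd.DataFrame({"resultId": [1, 2], "season_id": [72, 72], "round": [1, 2]})
events = pd.DataFrame({"event_id": [1, 3], "season_id": [72, 72], "race_round": [1, 2]})

# Joining on season_id alone pairs every result with every event of the season.
by_season = results.merge(events, on="season_id", how="left")

# Adding the round to the key gives one event per result.
by_round = results.merge(
    events,
    left_on=["season_id", "round"],
    right_on=["season_id", "race_round"],
    how="left",
    validate="many_to_one",  # raises MergeError if an event key is duplicated
)
```

Here `by_season` has four rows for two results, while `by_round` keeps two. Comparing `len()` before and after a merge is a cheap check for this kind of row multiplication.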


In [29]:
df_db_events = pd.read_sql("SELECT * FROM race_data.events", engine)

In [30]:
merged_df_race_season_result_dbevents=pd.merge(merged_df_race_season_result, df_db_events, on='season_id', how='left')

In [31]:
merged_df_race_season_result_dbevents.dtypes

resultId             int64
raceId               int64
driverId             int64
constructorId        int64
number              object
grid                 int64
position            object
positionText        object
positionOrder        int64
points             float64
laps                 int64
time_x              object
milliseconds        object
fastestLap          object
rank                object
fastestLapTime      object
fastestLapSpeed     object
statusId             int64
year                 int64
round                int64
circuitId            int64
name                object
date_x              object
time_y              object
url                 object
fp1_date            object
fp1_time            object
fp2_date            object
fp2_time            object
fp3_date            object
fp3_time            object
quali_date          object
quali_time          object
sprint_date         object
sprint_time         object
season_id            int64
championship_id      int64
e

In [32]:
results_db = merged_df_race_season_result_dbevents[['event_id','driverId', 'constructorId','number','grid','position', 'points','laps','time_x','fastestLapTime', 'rank','fastestLapSpeed','statusId']] 

In [33]:
results_db

Unnamed: 0,event_id,driverId,constructorId,number,grid,position,points,laps,time_x,fastestLapTime,rank,fastestLapSpeed,statusId
0,1,1,131,44,2,1,25.0,56,1:32:03.897,1:34.015,4,207.235,1
1,3,1,131,44,2,1,25.0,56,1:32:03.897,1:34.015,4,207.235,1
2,4,1,131,44,2,1,25.0,56,1:32:03.897,1:34.015,4,207.235,1
3,5,1,131,44,2,1,25.0,56,1:32:03.897,1:34.015,4,207.235,1
4,6,1,131,44,2,1,25.0,56,1:32:03.897,1:34.015,4,207.235,1
...,...,...,...,...,...,...,...,...,...,...,...,...,...
52795,128,822,51,77,16,\N,0.0,7,\N,2:02.755,20,170.300,130
52796,129,822,51,77,16,\N,0.0,7,\N,2:02.755,20,170.300,130
52797,130,822,51,77,16,\N,0.0,7,\N,2:02.755,20,170.300,130
52798,131,822,51,77,16,\N,0.0,7,\N,2:02.755,20,170.300,130


In [34]:
results_db.dtypes

event_id             int64
driverId             int64
constructorId        int64
number              object
grid                 int64
position            object
points             float64
laps                 int64
time_x              object
fastestLapTime      object
rank                object
fastestLapSpeed     object
statusId             int64
dtype: object

In [35]:
# Work on an explicit copy so the assignments below do not act on a slice of the merged DataFrame.
results_db = results_db.copy()
results_db['position'] = results_db['position'].replace(to_replace='\\N', value=0)
results_db['time_x'] = results_db['time_x'].replace(to_replace='\\N', value='+99:99.999')
results_db['fastestLapSpeed'] = results_db['fastestLapSpeed'].replace(to_replace='\\N', value=0)

In [36]:
new_column_names = {
        'driverId': 'driver_id'
        ,'constructorId': 'constructor_id'
        ,'time_x': 'time'
        ,'fastestLapTime': 'fastest_lap_time'
        ,'fastestLapSpeed': 'fastest_lap_speed'
        ,'statusId': 'status_id'
    
    }
# Assigning the result back avoids the SettingWithCopyWarning raised by an in-place rename on a slice.
results_db = results_db.rename(columns=new_column_names)


In [37]:
results_db = results_db.drop_duplicates()

In [38]:
results_db

Unnamed: 0,event_id,driver_id,constructor_id,number,grid,position,points,laps,time,fastest_lap_time,rank,fastest_lap_speed,status_id
0,1,1,131,44,2,1,25.0,56,1:32:03.897,1:34.015,4,207.235,1
1,3,1,131,44,2,1,25.0,56,1:32:03.897,1:34.015,4,207.235,1
2,4,1,131,44,2,1,25.0,56,1:32:03.897,1:34.015,4,207.235,1
3,5,1,131,44,2,1,25.0,56,1:32:03.897,1:34.015,4,207.235,1
4,6,1,131,44,2,1,25.0,56,1:32:03.897,1:34.015,4,207.235,1
...,...,...,...,...,...,...,...,...,...,...,...,...,...
52795,128,822,51,77,16,0,0.0,7,+99:99.999,2:02.755,20,170.300,130
52796,129,822,51,77,16,0,0.0,7,+99:99.999,2:02.755,20,170.300,130
52797,130,822,51,77,16,0,0.0,7,+99:99.999,2:02.755,20,170.300,130
52798,131,822,51,77,16,0,0.0,7,+99:99.999,2:02.755,20,170.300,130


In [39]:
conn_string = 'postgresql://admin:admin@pgdb/postgres'
engine = create_engine(conn_string)

table_name = 'results'
# to_sql opens and closes its own connection through the engine,
# so no separate SQLAlchemy or psycopg2 connection is needed.
results_db.to_sql(table_name, engine, if_exists='append', index=False, schema='race_data')

# End of the notebook