### Web scraping with pandas pd.read_html() + ODBC on SQL Server with pyodbc

#### Source: Wikipedia / Sport climbing at the 2024 IFSC Climbing European Championships

The 2024 IFSC Climbing European Championships, the 15th edition, was held in Villars-sur-Ollon, Switzerland from 27 August to 1 September 2024. 

The competition comprised lead, speed, bouldering, and combined events.

In [3]:
import numpy as np
import pandas as pd
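# pd.read_html() needs an HTML parser (lxml, or BeautifulSoup with html5lib);
# BeautifulSoup is already installed in the Anaconda environment

# pyodbc library for the SQL Server connection
import os
import pyodbc
# set up the SQL Server connection
server = r'-PCSJN\DATAVIZ'  # raw string so the backslash is kept literally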
database = 'Climbing_Staging'
connection_string= (
                    'DRIVER={SQL Server};SERVER='
                     + server
                     + ';DATABASE='
                     + database 
                     + ';Trusted_Connection=yes'                  
                    )

# Build an INSERT statement for one DataFrame row (passed as a dict)
# Note: values are interpolated into the SQL text, so only use this with trusted data
def template_SQL_insert_values(table_name, row_dict):
    # comma-separated column list, e.g. "(Rank,Nation,Gold)"
    columns = "(" + ",".join(str(k) for k in row_dict.keys()) + ")"

    req = f"""INSERT INTO {table_name} {columns} VALUES {tuple(row_dict.values())}"""
    return req
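
The helper above writes the values directly into the SQL text, which breaks on a name containing an apostrophe and leaves the statement open to SQL injection. A minimal sketch of a parameterized alternative follows. `build_parameterized_insert` and the table name `Medal_Demo` are hypothetical names introduced for illustration; the sketch relies on the `?` placeholder style that pyodbc accepts in `cursor.execute(sql, params)`.

```python
def build_parameterized_insert(table_name, row):
    """Return (sql, params) for one row, using '?' placeholders.

    Hypothetical helper, not part of the original notebook. Table and
    column names are still interpolated, so they must come from trusted code.
    """
    columns = ", ".join(str(k) for k in row.keys())
    placeholders = ", ".join("?" for _ in row)
    sql = f"INSERT INTO {table_name} ({columns}) VALUES ({placeholders})"
    return sql, list(row.values())


sql, params = build_parameterized_insert(
    "Medal_Demo",  # hypothetical table name
    {"Year": "2024", "Gold": "Sam Avezou"},
)
print(sql)
print(params)
# With an open pyodbc cursor this would be run as: cursor.execute(sql, params)
```

Because the values travel separately from the SQL text, the driver handles quoting and type conversion.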

In [4]:
### url
Wiki_Climbing_2024_IFSC_European_Championship = "https://en.wikipedia.org/wiki/2024_IFSC_Climbing_European_Championships#:~:text=The%202024%20IFSC%20Climbing%20European,%2C%20bouldering%2C%20and%20combined%20events."
### retrieve all the HTML tables found on the page (url)
Dataset_Wiki_Climbing_2024_IFSC_European_Championship = pd.read_html(Wiki_Climbing_2024_IFSC_European_Championship)
type(Dataset_Wiki_Climbing_2024_IFSC_European_Championship), len(Dataset_Wiki_Climbing_2024_IFSC_European_Championship)

(list, 4)

##### We get 1 list of 4 elements

In [5]:
type(Dataset_Wiki_Climbing_2024_IFSC_European_Championship[0])

pandas.core.frame.DataFrame

##### > Each element is a pandas DataFrame
##### ? What does each of the 4 DataFrames contain? Which ones are of interest?
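
One way to answer that question without displaying every table in full is to print a one-line summary per DataFrame. The sketch below assumes a hypothetical helper `summarize_tables` and uses small stand-in DataFrames so it runs without network access; in the notebook, the list returned by `pd.read_html()` would be passed instead.

```python
import pandas as pd


def summarize_tables(tables):
    """Return one (index, n_rows, n_cols, column_names) tuple per DataFrame."""
    return [
        (i, df.shape[0], df.shape[1], [str(c) for c in df.columns])
        for i, df in enumerate(tables)
    ]


# Stand-in data (hypothetical) so the sketch runs offline.
demo_tables = [
    pd.DataFrame({"Event": ["Men's lead"], "Gold": ["Sascha Lehmann"]}),
    pd.DataFrame({"Rank": ["1", "2"], "Nation": ["France (FRA)", "Italy (ITA)"]}),
]

for index, n_rows, n_cols, columns in summarize_tables(demo_tables):
    print(f"[{index}] {n_rows} rows x {n_cols} cols: {columns}")
```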

In [6]:
Dataset_Wiki_Climbing_2024_IFSC_European_Championship[0]

Unnamed: 0,2024 IFSC Climbing European Championships,2024 IFSC Climbing European Championships.1
0,Venue,Place du Rendez-Vous
1,Location,"Villars-sur-Ollon, Switzerland"
2,Date,27 August – 1 September
3,Website,ifsc-climbing.org
4,← 20222026 →,← 20222026 →


##### [0] holds summary data about the 2024 IFSC Climbing European Championships (venue, location, date, website) > NOT to discard
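
Table [0] is a two-column label/value layout, so it may be easier to use as a dictionary. The sketch below is an assumption about how that could be done, using a stand-in DataFrame with values copied from the output above.

```python
import pandas as pd

# Stand-in for table [0]; values copied from the output above.
infobox = pd.DataFrame(
    [
        ["Venue", "Place du Rendez-Vous"],
        ["Location", "Villars-sur-Ollon, Switzerland"],
        ["Date", "27 August – 1 September"],
    ]
)

# The first column holds the labels, the second column the values.
event_info = dict(zip(infobox.iloc[:, 0], infobox.iloc[:, 1]))
print(event_info["Location"])
```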

In [7]:
Dataset_Wiki_Climbing_2024_IFSC_European_Championship[1]

Unnamed: 0,Event,Gold,Silver,Bronze
0,Men's boulder[2],Sam Avezou France,Maximillian Milne Great Britain,Dayan Akhtar Great Britain
1,Men's lead[3],Sascha Lehmann Switzerland,Sam Avezou France,Guillermo Peinado Spain
2,Men's speed[4],Ludovico Fossali Italy,Matteo Zurloni Italy,Erik Noya Cardona Spain
3,Men's combined (boulder & lead)[5],Sam Avezou France,Sascha Lehmann Switzerland,Jonas Utelli Switzerland
4,Women's boulder[6],Naïlé Meignan France,Ayala Kerem Israel,Agathe Calliet France
5,Women's lead[7],Laura Rogora Italy,Jenya Kazbekova Ukraine,Lynn van der Meer Netherlands
6,Women's speed[8],Natalia Kałucka Poland,Patrycja Chudziak Poland,Giulia Randi Italy
7,Women's combined (boulder & lead)[9],Laura Rogora Italy,Jenya Kazbekova Ukraine,Zélia Avezou France


##### [1] lists the men's and women's medal winners (full name and nation) in boulder, lead, speed and combined (boulder & lead) at the 2024 European Championships > NOT to discard.

In [8]:
Dataset_Wiki_Climbing_2024_IFSC_European_Championship[1].columns

Index(['Event', 'Gold', 'Silver', 'Bronze'], dtype='object')

In [9]:
# Rename 'Event' to 'Year' to match the other tables; each gender/discipline result goes into its own table
Dataset_Wiki_Climbing_2024_IFSC_European_Championship[1].columns =  ['Year','Gold','Silver','Bronze']
Dataset_Wiki_Climbing_2024_IFSC_European_Championship[1]

Unnamed: 0,Year,Gold,Silver,Bronze
0,Men's boulder[2],Sam Avezou France,Maximillian Milne Great Britain,Dayan Akhtar Great Britain
1,Men's lead[3],Sascha Lehmann Switzerland,Sam Avezou France,Guillermo Peinado Spain
2,Men's speed[4],Ludovico Fossali Italy,Matteo Zurloni Italy,Erik Noya Cardona Spain
3,Men's combined (boulder & lead)[5],Sam Avezou France,Sascha Lehmann Switzerland,Jonas Utelli Switzerland
4,Women's boulder[6],Naïlé Meignan France,Ayala Kerem Israel,Agathe Calliet France
5,Women's lead[7],Laura Rogora Italy,Jenya Kazbekova Ukraine,Lynn van der Meer Netherlands
6,Women's speed[8],Natalia Kałucka Poland,Patrycja Chudziak Poland,Giulia Randi Italy
7,Women's combined (boulder & lead)[9],Laura Rogora Italy,Jenya Kazbekova Ukraine,Zélia Avezou France


In [10]:
# Split men/women and disciplines into separate objects (each .loc[i, :] row is a pandas Series)
Wiki_IFSC_Climbing_European_Championship_2024_Medal_Men_Bouldering_Df1 = Dataset_Wiki_Climbing_2024_IFSC_European_Championship[1].loc[0,:]
Wiki_IFSC_Climbing_European_Championship_2024_Medal_Men_Lead_Df1 = Dataset_Wiki_Climbing_2024_IFSC_European_Championship[1].loc[1,:]
Wiki_IFSC_Climbing_European_Championship_2024_Medal_Men_Speed_Df1 = Dataset_Wiki_Climbing_2024_IFSC_European_Championship[1].loc[2,:]
Wiki_IFSC_Climbing_European_Championship_2024_Medal_Men_Combined_Df1 = Dataset_Wiki_Climbing_2024_IFSC_European_Championship[1].loc[3,:]
Wiki_IFSC_Climbing_European_Championship_2024_Medal_Women_Bouldering_Df1 = Dataset_Wiki_Climbing_2024_IFSC_European_Championship[1].loc[4,:]
Wiki_IFSC_Climbing_European_Championship_2024_Medal_Women_Lead_Df1 = Dataset_Wiki_Climbing_2024_IFSC_European_Championship[1].loc[5,:]
Wiki_IFSC_Climbing_European_Championship_2024_Medal_Women_Speed_Df1 = Dataset_Wiki_Climbing_2024_IFSC_European_Championship[1].loc[6,:]
Wiki_IFSC_Climbing_European_Championship_2024_Medal_Women_Combined_Df1 = Dataset_Wiki_Climbing_2024_IFSC_European_Championship[1].loc[7,:]

In [11]:
Wiki_IFSC_Climbing_European_Championship_2024_Medal_Men_Bouldering_Df1

Year                      Men's boulder[2]
Gold                    Sam Avezou  France
Silver    Maximillian Milne  Great Britain
Bronze         Dayan Akhtar  Great Britain
Name: 0, dtype: object
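
The Series above still carries the footnote marker (`[2]`) and the nation after each name. A minimal sketch of cleaning both with regular expressions follows. It assumes the name and nation are separated by at least two whitespace characters, as the printed output suggests; `clean_event` and `split_medalist` are hypothetical helper names.

```python
import re


def clean_event(label):
    """Drop footnote markers such as '[2]' from an event label."""
    return re.sub(r"\[\d+\]", "", label).strip()


def split_medalist(cell):
    """Split 'Name  Nation' on a run of two or more whitespace characters.

    Returns (name, nation); nation is None when no such separator is found.
    """
    parts = re.split(r"\s{2,}", cell.strip(), maxsplit=1)
    if len(parts) == 2:
        return parts[0], parts[1]
    return parts[0], None


print(clean_event("Men's boulder[2]"))
print(split_medalist("Maximillian Milne  Great Britain"))
```

If the separator on the live page turns out to be a single space, this split returns the whole cell as the name, so the result should be checked before loading.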

In [12]:
# SQL script that creates the table
create_table = """Create TABLE Wiki_IFSC_Climbing_European_Championship_2024_Medal_Men_Bouldering_Df1
                                   ( 
                                        id INT IDENTITY PRIMARY KEY,
                                        Year VARCHAR(100),
                                        Gold VARCHAR(100),
                                        Silver VARCHAR(100),
                                        Bronze VARCHAR(100),                                                                   
                                   ) """

# SQL script that inserts the data into the table
# the country is left out, to match the other tables, and 2024 is stored in the Year field
populate_table = """INSERT INTO Wiki_IFSC_Climbing_European_Championship_2024_Medal_Men_Bouldering_Df1 VALUES ('2024','Sam Avezou','Maximillian Milne','Dayan Akhtar')
                    """
                                  
# open the connection (autocommit=True, so no explicit commit is needed)
cnxn = pyodbc.connect(connection_string, autocommit=True)
cursor = cnxn.cursor()

try:
    # create the table
    cursor.execute(create_table)

    # populate the table
    cursor.execute(populate_table)

except Exception as err:
    print('err:', err)
finally:
    # close the cursor and the connection
    cursor.close()
    cnxn.close()

In [13]:
Wiki_IFSC_Climbing_European_Championship_2024_Medal_Men_Lead_Df1

Year                    Men's lead[3]
Gold      Sascha Lehmann  Switzerland
Silver             Sam Avezou  France
Bronze       Guillermo Peinado  Spain
Name: 1, dtype: object

In [14]:
# SQL script that creates the table
create_table = """Create TABLE Wiki_IFSC_Climbing_European_Championship_2024_Medal_Men_Lead_Df1
                                   ( 
                                        id INT IDENTITY PRIMARY KEY,
                                        Year VARCHAR(100),
                                        Gold VARCHAR(100),
                                        Silver VARCHAR(100),
                                        Bronze VARCHAR(100),                                                                   
                                   ) """

# SQL script that inserts the data into the table
# the country is left out, to match the other tables, and 2024 is stored in the Year field
populate_table = """INSERT INTO Wiki_IFSC_Climbing_European_Championship_2024_Medal_Men_Lead_Df1 VALUES ('2024','Sascha Lehmann','Sam Avezou','Guillermo Peinado')
                    """
                                  
# open the connection (autocommit=True, so no explicit commit is needed)
cnxn = pyodbc.connect(connection_string, autocommit=True)
cursor = cnxn.cursor()

try:
    # create the table
    cursor.execute(create_table)

    # populate the table
    cursor.execute(populate_table)

except Exception as err:
    print('err:', err)
finally:
    # close the cursor and the connection
    cursor.close()
    cnxn.close()

In [15]:
Wiki_IFSC_Climbing_European_Championship_2024_Medal_Men_Speed_Df1

Year                Men's speed[4]
Gold       Ludovico Fossali  Italy
Silver       Matteo Zurloni  Italy
Bronze    Erik Noya Cardona  Spain
Name: 2, dtype: object

In [16]:
# SQL script that creates the table
create_table = """Create TABLE Wiki_IFSC_Climbing_European_Championship_2024_Medal_Men_Speed_Df1
                                   ( 
                                        id INT IDENTITY PRIMARY KEY,
                                        Year VARCHAR(100),
                                        Gold VARCHAR(100),
                                        Silver VARCHAR(100),
                                        Bronze VARCHAR(100),                                                                   
                                   ) """

# SQL script that inserts the data into the table
# the country is left out, to match the other tables, and 2024 is stored in the Year field
populate_table = """INSERT INTO Wiki_IFSC_Climbing_European_Championship_2024_Medal_Men_Speed_Df1 VALUES ('2024','Ludovico Fossali','Matteo Zurloni','Erik Noya Cardona')
                    """
                                  
# open the connection (autocommit=True, so no explicit commit is needed)
cnxn = pyodbc.connect(connection_string, autocommit=True)
cursor = cnxn.cursor()

try:
    # create the table
    cursor.execute(create_table)

    # populate the table
    cursor.execute(populate_table)

except Exception as err:
    print('err:', err)
finally:
    # close the cursor and the connection
    cursor.close()
    cnxn.close()

In [17]:
Wiki_IFSC_Climbing_European_Championship_2024_Medal_Men_Combined_Df1

Year      Men's combined (boulder & lead)[5]
Gold                      Sam Avezou  France
Silver           Sascha Lehmann  Switzerland
Bronze             Jonas Utelli  Switzerland
Name: 3, dtype: object

In [18]:
# SQL script that creates the table
create_table = """Create TABLE Wiki_IFSC_Climbing_European_Championship_2024_Medal_Men_Combined_Df1
                                   ( 
                                        id INT IDENTITY PRIMARY KEY,
                                        Year VARCHAR(100),
                                        Gold VARCHAR(100),
                                        Silver VARCHAR(100),
                                        Bronze VARCHAR(100),                                                                   
                                   ) """

# SQL script that inserts the data into the table
# the country is left out, to match the other tables, and 2024 is stored in the Year field
populate_table = """INSERT INTO Wiki_IFSC_Climbing_European_Championship_2024_Medal_Men_Combined_Df1 VALUES ('2024','Sam Avezou','Sascha Lehmann','Jonas Utelli')
                    """
                                  
# open the connection (autocommit=True, so no explicit commit is needed)
cnxn = pyodbc.connect(connection_string, autocommit=True)
cursor = cnxn.cursor()

try:
    # create the table
    cursor.execute(create_table)

    # populate the table
    cursor.execute(populate_table)

except Exception as err:
    print('err:', err)
finally:
    # close the cursor and the connection
    cursor.close()
    cnxn.close()

In [19]:
Wiki_IFSC_Climbing_European_Championship_2024_Medal_Women_Bouldering_Df1

Year          Women's boulder[6]
Gold       Naïlé Meignan  France
Silver       Ayala Kerem  Israel
Bronze    Agathe Calliet  France
Name: 4, dtype: object

In [20]:
# SQL script that creates the table
create_table = """Create TABLE Wiki_IFSC_Climbing_European_Championship_2024_Medal_Women_Bouldering_Df1
                                   ( 
                                        id INT IDENTITY PRIMARY KEY,
                                        Year VARCHAR(100),
                                        Gold VARCHAR(100),
                                        Silver VARCHAR(100),
                                        Bronze VARCHAR(100),                                                                   
                                   ) """

# SQL script that inserts the data into the table
# the country is left out, to match the other tables, and 2024 is stored in the Year field
populate_table = """INSERT INTO Wiki_IFSC_Climbing_European_Championship_2024_Medal_Women_Bouldering_Df1 VALUES ('2024','Naïlé Meignan','Ayala Kerem','Agathe Calliet')
                    """
                                  
# open the connection (autocommit=True, so no explicit commit is needed)
cnxn = pyodbc.connect(connection_string, autocommit=True)
cursor = cnxn.cursor()

try:
    # create the table
    cursor.execute(create_table)

    # populate the table
    cursor.execute(populate_table)

except Exception as err:
    print('err:', err)
finally:
    # close the cursor and the connection
    cursor.close()
    cnxn.close()

In [21]:
Wiki_IFSC_Climbing_European_Championship_2024_Medal_Women_Lead_Df1

Year                     Women's lead[7]
Gold                 Laura Rogora  Italy
Silver          Jenya Kazbekova  Ukraine
Bronze    Lynn van der Meer  Netherlands
Name: 5, dtype: object

In [22]:
# SQL script that creates the table
create_table = """Create TABLE Wiki_IFSC_Climbing_European_Championship_2024_Medal_Women_Lead_Df1
                                   ( 
                                        id INT IDENTITY PRIMARY KEY,
                                        Year VARCHAR(100),
                                        Gold VARCHAR(100),
                                        Silver VARCHAR(100),
                                        Bronze VARCHAR(100),                                                                   
                                   ) """

# SQL script that inserts the data into the table
# the country is left out, to match the other tables, and 2024 is stored in the Year field
populate_table = """INSERT INTO Wiki_IFSC_Climbing_European_Championship_2024_Medal_Women_Lead_Df1 VALUES ('2024','Laura Rogora','Jenya Kazbekova','Lynn van der Meer')
                    """
                                  
# open the connection (autocommit=True, so no explicit commit is needed)
cnxn = pyodbc.connect(connection_string, autocommit=True)
cursor = cnxn.cursor()

try:
    # create the table
    cursor.execute(create_table)

    # populate the table
    cursor.execute(populate_table)

except Exception as err:
    print('err:', err)
finally:
    # close the cursor and the connection
    cursor.close()
    cnxn.close()

In [23]:
Wiki_IFSC_Climbing_European_Championship_2024_Medal_Women_Speed_Df1

Year               Women's speed[8]
Gold        Natalia Kałucka  Poland
Silver    Patrycja Chudziak  Poland
Bronze          Giulia Randi  Italy
Name: 6, dtype: object

In [24]:
# SQL script that creates the table
create_table = """Create TABLE Wiki_IFSC_Climbing_European_Championship_2024_Medal_Women_Speed_Df1
                                   ( 
                                        id INT IDENTITY PRIMARY KEY,
                                        Year VARCHAR(100),
                                        Gold VARCHAR(100),
                                        Silver VARCHAR(100),
                                        Bronze VARCHAR(100),                                                                   
                                   ) """

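# SQL script that inserts the data into the table
# the country is left out, to match the other tables, and 2024 is stored in the Year field
# note: depending on the database collation, 'ł' may not be stored faithfully in a VARCHAR column;
# an NVARCHAR column with an N'...' literal avoids this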
populate_table = """INSERT INTO Wiki_IFSC_Climbing_European_Championship_2024_Medal_Women_Speed_Df1 VALUES ('2024','Natalia Kałucka','Patrycja Chudziak','Giulia Randi')
                    """
                                  
# open the connection (autocommit=True, so no explicit commit is needed)
cnxn = pyodbc.connect(connection_string, autocommit=True)
cursor = cnxn.cursor()

try:
    # create the table
    cursor.execute(create_table)

    # populate the table
    cursor.execute(populate_table)

except Exception as err:
    print('err:', err)
finally:
    # close the cursor and the connection
    cursor.close()
    cnxn.close()

In [25]:
Wiki_IFSC_Climbing_European_Championship_2024_Medal_Women_Combined_Df1

Year      Women's combined (boulder & lead)[9]
Gold                       Laura Rogora  Italy
Silver                Jenya Kazbekova  Ukraine
Bronze                    Zélia Avezou  France
Name: 7, dtype: object

In [26]:
# SQL script that creates the table
create_table = """Create TABLE Wiki_IFSC_Climbing_European_Championship_2024_Medal_Women_Combined_Df1
                                   ( 
                                        id INT IDENTITY PRIMARY KEY,
                                        Year VARCHAR(100),
                                        Gold VARCHAR(100),
                                        Silver VARCHAR(100),
                                        Bronze VARCHAR(100),                                                                   
                                   ) """

# SQL script that inserts the data into the table
# the country is left out, to match the other tables, and 2024 is stored in the Year field
populate_table = """INSERT INTO Wiki_IFSC_Climbing_European_Championship_2024_Medal_Women_Combined_Df1 VALUES ('2024','Laura Rogora','Jenya Kazbekova','Zélia Avezou')
                    """
                                  
# open the connection (autocommit=True, so no explicit commit is needed)
cnxn = pyodbc.connect(connection_string, autocommit=True)
cursor = cnxn.cursor()

try:
    # create the table
    cursor.execute(create_table)

    # populate the table
    cursor.execute(populate_table)

except Exception as err:
    print('err:', err)
finally:
    # close the cursor and the connection
    cursor.close()
    cnxn.close()
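
The eight cells above differ only in the table name and the three medalists, so the statements could also be generated in a loop. The sketch below is one possible way to do that, not the notebook's method. `MEDALISTS`, `TABLE_PREFIX` and `build_statements` are hypothetical names; the table naming and the column definitions follow the cells above, and the inserts use `?` placeholders as accepted by pyodbc.

```python
# Medalists copied from table [1] above (names only, as in the notebook).
MEDALISTS = {
    "Men_Bouldering": ("Sam Avezou", "Maximillian Milne", "Dayan Akhtar"),
    "Men_Lead": ("Sascha Lehmann", "Sam Avezou", "Guillermo Peinado"),
    "Men_Speed": ("Ludovico Fossali", "Matteo Zurloni", "Erik Noya Cardona"),
    "Men_Combined": ("Sam Avezou", "Sascha Lehmann", "Jonas Utelli"),
    "Women_Bouldering": ("Naïlé Meignan", "Ayala Kerem", "Agathe Calliet"),
    "Women_Lead": ("Laura Rogora", "Jenya Kazbekova", "Lynn van der Meer"),
    "Women_Speed": ("Natalia Kałucka", "Patrycja Chudziak", "Giulia Randi"),
    "Women_Combined": ("Laura Rogora", "Jenya Kazbekova", "Zélia Avezou"),
}

TABLE_PREFIX = "Wiki_IFSC_Climbing_European_Championship_2024_Medal_"


def build_statements(suffix, medalists, year="2024"):
    """Return (create_sql, insert_sql, params) for one medal table."""
    table = f"{TABLE_PREFIX}{suffix}_Df1"
    create_sql = (
        f"CREATE TABLE {table} ("
        "id INT IDENTITY PRIMARY KEY, "
        "Year VARCHAR(100), Gold VARCHAR(100), "
        "Silver VARCHAR(100), Bronze VARCHAR(100))"
    )
    insert_sql = (
        f"INSERT INTO {table} (Year, Gold, Silver, Bronze) VALUES (?, ?, ?, ?)"
    )
    return create_sql, insert_sql, (year, *medalists)


statements = [build_statements(s, m) for s, m in MEDALISTS.items()]
print(len(statements))
# With an open pyodbc cursor, each entry would be run as:
#     cursor.execute(create_sql)
#     cursor.execute(insert_sql, params)
```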

In [27]:
Dataset_Wiki_Climbing_2024_IFSC_European_Championship[2]

Unnamed: 0,Rank,Nation,Gold,Silver,Bronze,Total
0,1,France (FRA),3,1,2,6
1,2,Italy (ITA),3,1,1,5
2,3,Switzerland (SUI),1,1,1,3
3,4,Poland (POL),1,1,0,2
4,5,Ukraine (UKR),0,2,0,2
5,6,Great Britain (GBR),0,1,1,2
6,7,Israel (ISR),0,1,0,1
7,8,Spain (ESP),0,0,2,2
8,9,Netherlands (NED),0,0,1,1
9,Totals (9 entries),Totals (9 entries),8,8,8,24


[2] is the medal table by nation at the 2024 European Championships > NOT to discard.

In [28]:
Dataset_Wiki_Climbing_2024_IFSC_European_Championship[2].dtypes

Rank      object
Nation    object
Gold       int64
Silver     int64
Bronze     int64
Total      int64
dtype: object
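
`Rank` and `Nation` come back as `object` because the last row is a "Totals (9 entries)" summary rather than a nation. If that row is not wanted in the SQL table, it could be filtered out before loading. The sketch below is an assumption about how to do this, using a stand-in DataFrame with a few rows copied from the output above.

```python
import pandas as pd

# Stand-in for table [2]; a few rows copied from the output above.
medal_table = pd.DataFrame(
    {
        "Rank": ["1", "2", "Totals (9 entries)"],
        "Nation": ["France (FRA)", "Italy (ITA)", "Totals (9 entries)"],
        "Gold": [3, 3, 8],
    }
)

# Keep only rows whose Rank is purely numeric, then convert Rank to int.
is_ranked = medal_table["Rank"].astype(str).str.isdigit()
ranked = medal_table[is_ranked].copy()
ranked["Rank"] = ranked["Rank"].astype(int)

print(ranked)
```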

In [29]:
Wiki_IFSC_Climbing_European_Championship_2024_Countries_Total_Ranking_Df2 = Dataset_Wiki_Climbing_2024_IFSC_European_Championship[2]

# SQL script that creates the table
create_table = """Create TABLE Wiki_IFSC_Climbing_European_Championship_2024_Countries_Total_Ranking_Df2
                                   ( 
                                        id INT IDENTITY PRIMARY KEY,
                                        Rank VARCHAR(50),
                                        Nation VARCHAR(50),
                                        Gold INT,
                                        Silver INT,
                                        Bronze INT,
                                        Total INT                                                                   
                                   ) """

# open the connection (autocommit=True, so no explicit commit is needed)
cnxn = pyodbc.connect(connection_string, autocommit=True)
cursor = cnxn.cursor()

try:
    # create the table
    cursor.execute(create_table)

    # populate the table, one INSERT per DataFrame row
    for i, row in Wiki_IFSC_Climbing_European_Championship_2024_Countries_Total_Ranking_Df2.iterrows():
        row_dict = row.to_dict()
        temp = template_SQL_insert_values('Wiki_IFSC_Climbing_European_Championship_2024_Countries_Total_Ranking_Df2', row_dict)
        cursor.execute(temp)

except Exception as err:
    print('err:', err)
finally:
    # close the cursor and the connection
    cursor.close()
    cnxn.close()
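
The row-by-row loop above could also be expressed as a single `executemany` call with `?` placeholders. In the sketch below, `sqlite3` stands in for SQL Server so that it runs anywhere; pyodbc cursors expose `executemany` with the same placeholder style. The short table name `Countries_Total_Ranking` is hypothetical, and the two rows are copied from table [2].

```python
import sqlite3

# Rows as plain tuples; from a DataFrame they could be produced with
# df.itertuples(index=False, name=None).
rows = [
    ("1", "France (FRA)", 3, 1, 2, 6),
    ("2", "Italy (ITA)", 3, 1, 1, 5),
]

insert_sql = (
    "INSERT INTO Countries_Total_Ranking "  # hypothetical short table name
    "(Rank, Nation, Gold, Silver, Bronze, Total) VALUES (?, ?, ?, ?, ?, ?)"
)

# sqlite3 is only a stand-in for the SQL Server connection used in the notebook.
conn = sqlite3.connect(":memory:")
cursor = conn.cursor()
cursor.execute(
    "CREATE TABLE Countries_Total_Ranking ("
    "Rank TEXT, Nation TEXT, Gold INT, Silver INT, Bronze INT, Total INT)"
)
cursor.executemany(insert_sql, rows)  # one call for all rows
conn.commit()

loaded = cursor.execute("SELECT COUNT(*) FROM Countries_Total_Ranking").fetchone()[0]
print(loaded)
conn.close()
```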

In [30]:
Dataset_Wiki_Climbing_2024_IFSC_European_Championship[3]

Unnamed: 0,vteEuropean championships in 2024,vteEuropean championships in 2024.1
0,Summer sports & indoor sports,Aquatic sports Athletics Archery indoor outdoo...
1,Winter sports,Biathlon Bobsleigh / Skeleton Curling Figure s...
2,Cue & mind sports,Chess Darts
3,Motor sports,Formula Regional Le Mans Series Motocross Rall...


[3] is a navigation box listing other European championships held in 2024 > to discard