# <a id='toc1_'></a>[Load to SQL](#toc0_)

**Table of contents**<a id='toc0_'></a>    
- [Load to SQL](#toc1_)    
  - [Create neuropapers_db](#toc1_1_)    
  - [Fill neuropapers_db](#toc1_2_)    
    - [universities](#toc1_2_1_)    
    - [countries](#toc1_2_2_)    
    - [publications](#toc1_2_3_)    
    - [journals](#toc1_2_4_)    
    - [affiliations](#toc1_2_5_)    
    - [first_author](#toc1_2_6_)    
    - [last_author](#toc1_2_7_)    

<!-- vscode-jupyter-toc-config
	numbering=false
	anchor=true
	flat=false
	minLevel=1
	maxLevel=6
	/vscode-jupyter-toc-config -->
<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. DO NOT WRITE YOUR TEXT IN THIS CELL -->

In [1]:
# %pip install SQLAlchemy==1.4.36
# %pip install pymysql

In [49]:
# Libraries
import pandas as pd

# SQLAlchemy ✨
from sqlalchemy import create_engine

## <a id='toc1_1_'></a>[Create neuropapers_db](#toc0_)

In [2]:
with open('../pswd_mysql.txt', 'r') as file:
    paswd = file.read().strip()

In [3]:
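# Server-level connection string (no database selected yet)
str_conn = f'mysql+pymysql://root:{paswd}@localhost:3306'
# Note: create_engine returns an Engine, not a DB-API cursor;
# the name `cursor` is kept because later cells refer to it.
cursor = create_engine(str_conn)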
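
The f-string above places the password directly into the URL, which can break if the password contains characters such as `@` or `/`. A minimal sketch of percent-encoding the credentials first, assuming a hypothetical helper named `build_mysql_url` (not part of the original notebook):

```python
from urllib.parse import quote_plus


def build_mysql_url(password, user="root", host="localhost", port=3306, database=None):
    """Build a mysql+pymysql URL with percent-encoded credentials.

    Hypothetical helper for illustration; the defaults mirror the
    connection string used in this notebook.
    """
    url = f"mysql+pymysql://{quote_plus(user)}:{quote_plus(password)}@{host}:{port}"
    if database:
        url += f"/{database}"
    return url


# A made-up password containing characters that are special in URLs
print(build_mysql_url("p@ss/word", database="neuropapers_db"))
# mysql+pymysql://root:p%40ss%2Fword@localhost:3306/neuropapers_db
```

The resulting string can be passed to `create_engine` in the same way as `str_conn`.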

In [4]:
cursor.execute('DROP DATABASE IF EXISTS neuropapers_db;')
cursor.execute('CREATE DATABASE neuropapers_db;')

<sqlalchemy.engine.cursor.LegacyCursorResult at 0x1cff9ffa190>
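
The cell above calls `execute` directly on the engine, which works with the SQLAlchemy 1.4 release pinned at the top of this notebook but was removed in SQLAlchemy 2.0. A sketch of the pattern that works in both versions, using an explicit connection and `text()`. The helper name `run_statements` and the in-memory SQLite engine are assumptions made so the example runs without a MySQL server:

```python
from sqlalchemy import create_engine, text


def run_statements(engine, statements):
    """Run each SQL string on one explicit connection inside a transaction."""
    with engine.begin() as conn:
        for statement in statements:
            conn.execute(text(statement))


# Stand-in engine (assumption): in-memory SQLite instead of the MySQL server
demo_engine = create_engine("sqlite://")
run_statements(demo_engine, [
    "DROP TABLE IF EXISTS demo",
    "CREATE TABLE demo (id INTEGER PRIMARY KEY)",
])
```

Against the MySQL engine, the same helper could receive the `DROP DATABASE` and `CREATE DATABASE` statements from the cell above.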

In [5]:
str_conn = f'mysql+pymysql://root:{paswd}@localhost:3306/neuropapers_db'
cursor = create_engine(str_conn)

___
## <a id='toc1_2_'></a>[Fill neuropapers_db](#toc0_)
### <a id='toc1_2_1_'></a>[universities](#toc0_)

In [16]:
universities = pd.read_csv('../data/universities_db.csv')
universities.shape

(1498, 8)

In [17]:
universities.columns

Index(['Unnamed: 0', 'Rank_2024', 'Institution_Name', 'Country', 'Location',
       'Academic_Reputation', 'Latitude', 'Longitude'],
      dtype='object')

In [18]:
column_names = ['index', 'rank_2024', 'institution_name', 'country_code', 'country',
       'academic_reputation', 'latitude', 'longitude']

universities.columns = column_names
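
Assigning a list to `.columns` is positional, so a change in the CSV column order would silently mislabel the data. A sketch of the same rename done with an explicit mapping and `errors='raise'`, so a missing source column fails loudly. The mapping is taken from the column listing above; `rename_strict` is a hypothetical helper name and the empty `demo` frame stands in for the real CSV:

```python
import pandas as pd

# Old name -> new name. In this dataset 'Country' holds the two-letter
# code and 'Location' holds the country name.
RENAME_MAP = {
    'Unnamed: 0': 'index',
    'Rank_2024': 'rank_2024',
    'Institution_Name': 'institution_name',
    'Country': 'country_code',
    'Location': 'country',
    'Academic_Reputation': 'academic_reputation',
    'Latitude': 'latitude',
    'Longitude': 'longitude',
}


def rename_strict(df, mapping):
    """Rename columns by name; raise KeyError if any source column is absent."""
    return df.rename(columns=mapping, errors='raise')


# Empty stand-in frame with the original headers (assumption for the demo)
demo = pd.DataFrame(columns=list(RENAME_MAP))
renamed = rename_strict(demo, RENAME_MAP)
print(list(renamed.columns))
```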

In [19]:
universities.drop('index', axis=1, inplace=True)
universities.head(2)

| | rank_2024 | institution_name | country_code | country | academic_reputation | latitude | longitude |
|---|---|---|---|---|---|---|---|
| 0 | 1 | Massachusetts Institute of Technology (MIT) | US | United States | 100.0 | 42.360091 | -71.09416 |
| 1 | 2 | University of Cambridge | UK | United Kingdom | 100.0 | 55.378051 | -3.435973 |


In [20]:
# Load to SQL
# index=True also writes the DataFrame's integer index as an extra `index` column.
universities.to_sql(name='universities',
                    con=cursor,
                    if_exists='replace',
                    index=True)

1498
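
The value printed above is the return value of `to_sql`. One way to confirm the load independently is to count the rows back from the database. A minimal sketch, assuming a hypothetical helper `load_and_count` and using an in-memory SQLite connection as a stand-in for the MySQL engine so that it runs anywhere:

```python
import sqlite3

import pandas as pd


def load_and_count(df, table, con):
    """Write df to `table`, then read the row count back from the database."""
    df.to_sql(name=table, con=con, if_exists='replace', index=False)
    result = pd.read_sql(f'SELECT COUNT(*) AS n FROM {table}', con)
    return int(result['n'].iloc[0])


# Tiny made-up frame and SQLite stand-in (assumptions for the demo)
demo = pd.DataFrame({'rank_2024': [1, 2], 'institution_name': ['A', 'B']})
con = sqlite3.connect(':memory:')
n_rows = load_and_count(demo, 'universities', con)
print(n_rows == len(demo))
```

With the real data, the count read back should match `universities.shape[0]`.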

### <a id='toc1_2_2_'></a>[countries](#toc0_)

In [39]:
countries = pd.read_csv('../data/countries_db.csv')
countries.shape

(249, 57)

In [41]:
countries.drop(['Unnamed: 0'], axis=1, inplace=True)

In [43]:
column_names = ['index', 'country', 'official_state_name', 'sovereignty',
       'alpha_2_code', 'alpha_3_code', 'mln_2010', 'mln_2011', 'mln_2012',
       'mln_2013', 'mln_2014', 'mln_2015', 'mln_2016', 'mln_2017', 'mln_2018',
       'mln_2019', 'mln_2020', 'mln_2021', 'mln_2022', 'gdp_2010', 'gdp_2011',
       'gdp_2012', 'gdp_2013', 'gdp_2014', 'gdp_2015', 'gdp_2016', 'gdp_2017',
       'gdp_2018', 'gdp_2019', 'gdp_2020', 'gdp_2021', 'gdp_2022', 'tot_2010',
       'tot_2011', 'tot_2012', 'tot_2013', 'tot_2014', 'tot_2015', 'tot_2016',
       'tot_2017', 'tot_2018', 'tot_2019', 'tot_2020', 'tot_2021', 'wom_2010',
       'wom_2011', 'wom_2012', 'wom_2013', 'wom_2014', 'wom_2015', 'wom_2016',
       'wom_2017', 'wom_2018', 'wom_2019', 'wom_2020', 'wom_2021']

countries.columns = column_names

In [48]:
countries.head(2)

| | index | country | official_state_name | sovereignty | alpha_2_code | alpha_3_code | mln_2010 | mln_2011 | mln_2012 | mln_2013 | ... | wom_2012 | wom_2013 | wom_2014 | wom_2015 | wom_2016 | wom_2017 | wom_2018 | wom_2019 | wom_2020 | wom_2021 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 4 | Afghanistan | The Islamic Republic of Afghanistan | UN member state | AF | AFG | 0.0 | 0.0 | 0.0 | 0.0 | ... | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 1 | 248 | Åland Islands | Åland | Finland | AX | ALA | 0.0 | 0.0 | 0.0 | 0.0 | ... | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |


In [45]:
# Load to SQL

countries.to_sql(name='countries',     
                 con=cursor,         
                 if_exists='replace',  
                 index=False)

249

### <a id='toc1_2_3_'></a>[publications](#toc0_)

### <a id='toc1_2_4_'></a>[journals](#toc0_)

### <a id='toc1_2_5_'></a>[affiliations](#toc0_)

### <a id='toc1_2_6_'></a>[first_author](#toc0_)

### <a id='toc1_2_7_'></a>[last_author](#toc0_)

In [46]:
# List the tables currently in neuropapers_db
cursor.execute('SHOW TABLES;').fetchall()

[('countries',), ('country_codes',), ('universities',)]

In [47]:
# drop country_codes
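
The cell above only states the intent, and the notebook does not show the statement that was run. One possible way to drop the leftover table and confirm the result is sketched below. `drop_table` is a hypothetical helper, and the in-memory SQLite engine with three empty tables is an assumption that stands in for `neuropapers_db`:

```python
from sqlalchemy import create_engine, inspect, text


def drop_table(engine, table_name):
    """Drop `table_name` if it exists and return the remaining table names."""
    # Identifiers cannot be bound as parameters, so pass trusted names only.
    with engine.begin() as conn:
        conn.execute(text(f"DROP TABLE IF EXISTS {table_name}"))
    return sorted(inspect(engine).get_table_names())


# Stand-in database (assumption): same table names as listed above
demo_engine = create_engine("sqlite://")
with demo_engine.begin() as conn:
    for name in ("countries", "country_codes", "universities"):
        conn.execute(text(f"CREATE TABLE {name} (id INTEGER)"))

remaining = drop_table(demo_engine, "country_codes")
print(remaining)
# ['countries', 'universities']
```

`inspect(engine).get_table_names()` is a dialect-independent alternative to `SHOW TABLES`, which is specific to MySQL.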