# **Executing Queries with the NEIVA Database**

## **1. Setting up the NEIVA database in the Google Colab environment:**

In [None]:
!pip install mysql-connector-python # Install the connector that lets Python talk to MySQL databases.
!pip install pubchempy              # Install PubChemPy for PubChem compound lookups.
!apt-get update                     # Refresh the package index before installing MySQL.
!pip install pymysql                # Install PyMySQL, a pure-Python MySQL driver.
!apt-get -y install mysql-server    # Install the MySQL server in the Colab environment.
!service mysql start                # Start the MySQL server once it is installed.

# Set the MySQL root password. Here 'root' is used as the password.

!mysql -e "ALTER USER 'root'@'localhost' IDENTIFIED WITH 'mysql_native_password' BY 'root';FLUSH PRIVILEGES;"

In [None]:
# Remove the existing NEIVA repository if it exists.
!rm -rf NEIVA
# Download the NEIVA repository from GitHub
!git clone https://github.com/NEIVA-BB-emissions-Inventory/NEIVA.git

In [None]:
# Check if the repository is downloaded by listing its contents.
!ls NEIVA

In [None]:
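# Drop any existing NEIVA databases; IF EXISTS avoids an error on the first run.
!mysql -u root -proot -e "DROP DATABASE IF EXISTS backend_db"
!mysql -u root -proot -e "DROP DATABASE IF EXISTS primary_db"
!mysql -u root -proot -e "DROP DATABASE IF EXISTS raw_db"
!mysql -u root -proot -e "DROP DATABASE IF EXISTS legacy_db"
!mysql -u root -proot -e "DROP DATABASE IF EXISTS neiva_output_db"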

In [None]:
# Initialize MySQL databases and import data from NEIVA SQL files.
!mysql -u root -proot -e "CREATE DATABASE IF NOT EXISTS backend_db"
!mysql -u root -proot backend_db < NEIVA/data/backend_db.sql
!mysql -u root -proot -e "CREATE DATABASE IF NOT EXISTS legacy_db"
!mysql -u root -proot legacy_db < NEIVA/data/legacy_db.sql
!mysql -u root -proot -e "CREATE DATABASE IF NOT EXISTS neiva_output_db"
!mysql -u root -proot neiva_output_db < NEIVA/data/neiva_output_db.sql
!mysql -u root -proot -e "CREATE DATABASE IF NOT EXISTS primary_db"
!mysql -u root -proot primary_db < NEIVA/data/primary_db.sql
!mysql -u root -proot -e "CREATE DATABASE IF NOT EXISTS raw_db"
!mysql -u root -proot raw_db < NEIVA/data/raw_db.sql
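
The five create-and-import pairs above follow one pattern, so they could also be generated in a loop. The sketch below only builds the command strings. It assumes each dump is named `<database>.sql` under `NEIVA/data`, as in the commands above; `build_import_commands` is a hypothetical helper name, not part of NEIVA.

```python
# Database names taken from the import commands above.
DB_NAMES = ["backend_db", "legacy_db", "neiva_output_db", "primary_db", "raw_db"]


def build_import_commands(db_names, data_dir="NEIVA/data", user="root", password="root"):
    """Return the shell commands that create each database and load its dump.

    Hypothetical helper: assumes every dump file is <data_dir>/<name>.sql.
    """
    commands = []
    auth = f"mysql -u {user} -p{password}"
    for name in db_names:
        # First create the database if it is missing, then load the SQL dump into it.
        commands.append(f'{auth} -e "CREATE DATABASE IF NOT EXISTS {name}"')
        commands.append(f"{auth} {name} < {data_dir}/{name}.sql")
    return commands


commands = build_import_commands(DB_NAMES)
for command in commands:
    print(command)
```

In a notebook, each string could then be run with `subprocess.run(command, shell=True, check=True)`, which stops at the first failing import instead of continuing silently.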

## **2. Viewing the databases and their contents using MySQL syntax:**

In [None]:
!mysql -u root -proot -e "show databases;"

+--------------------+
| Database           |
+--------------------+
| backend_db         |
| information_schema |
| legacy_db          |
| mysql              |
| neiva_output_db    |
| performance_schema |
| primary_db         |
| raw_db             |
| sys                |
+--------------------+


In [None]:
!mysql -u root -proot -e "use backend_db; show tables;"

+---------------------------------------+
| Tables_in_backend_db                  |
+---------------------------------------+
| bkdb_compound_flaming_combustion_type |
| bkdb_correction_factor                |
| bkdb_fc_calc_simple                   |
| bkdb_fc_calc_specific                 |
| bkdb_imp_comList_for_calc             |
| bkdb_info_efcol                       |
| bkdb_info_rdb_ldb                     |
| bkdb_info_table_name                  |
| bkdb_nmog_LumpCom_altName             |
| bkdb_nmog_LumpedCom                   |
| bkdb_nmog_MultLumCom                  |
| bkdb_nmog_MultLumCom_slc_id           |
| bkdb_nmog_MultLumCom_slc_id_altName   |
| bkdb_pm_order_seq                     |
| chem_property_h15isomers              |
| chem_property_inchi                   |
| chem_property_lumpCom                 |
| chem_property_lumpCom_spec            |
| info_efcol_processed_data             |
| property_surrogate_info               |
+---------------------------------------+

In [None]:
!mysql -u root -proot -e "use primary_db; select * from pdb_travis23 where pollutant_category='inorganic gas';"

+---------+----------+----------------------+--------------------+------------------+------------------+---------------------+--------------------------+-------------------+------------------+-----------------------+-----------------------+------------------------+-------------------------------------------+
| mm      | formula  | compound             | pollutant_category | EF_corn_travis23 | EF_rice_travis23 | EF_soybean_travis23 | EF_winter_wheat_travis23 | EF_slash_travis23 | EF_pile_travis23 | EF_shrubland_travis23 | EF_grassland_travis23 | EF_blackwater_travis23 | id                                        |
+---------+----------+----------------------+--------------------+------------------+------------------+---------------------+--------------------------+-------------------+------------------+-----------------------+-----------------------+------------------------+-------------------------------------------+
|   28.01 | CO       | Carbon Monoxide      | inorganic gas      |    

In [None]:
!mysql -u root -proot -e "use primary_db; select * from pdb_trf_hodgson18;"

+--------+---------+-----------------+--------------------+-----------------------+--------------------------------+---------------------------------------------+
| mm     | formula | compound        | pollutant_category | EF_rondonia_hodgson18 | EF_tocantins_cerrado_hodgson18 | id                                          |
+--------+---------+-----------------+--------------------+-----------------------+--------------------------------+---------------------------------------------+
| 44.009 | CO2     | carbon dioxide  | inorganic gas      |                  1447 |                           1711 | InChI=1S/CO2/c2-1-3                         |
|  28.01 | CO      | carbon monoxide | inorganic gas      |                   237 |                             74 | InChI=1S/CO/c1-2                            |
| 16.043 | CH4     | methane         | methane            |                  5.17 |                           2.23 | InChI=1S/CH4/h1H4                           |
|   NULL | NULL    | O
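
A filter such as the `pollutant_category='inorganic gas'` condition used earlier can be passed as a bound parameter instead of being written into the SQL string. The sketch below is illustrative only: it uses Python's built-in `sqlite3` as a stand-in for MySQL (an assumption, so that it runs without a server) and loads just the three complete rows visible in the output above. `select_by_category` is a hypothetical helper name.

```python
import sqlite3

# The three complete rows visible in the pdb_trf_hodgson18 output above.
ROWS = [
    (44.009, "CO2", "carbon dioxide", "inorganic gas", 1447, 1711, "InChI=1S/CO2/c2-1-3"),
    (28.01, "CO", "carbon monoxide", "inorganic gas", 237, 74, "InChI=1S/CO/c1-2"),
    (16.043, "CH4", "methane", "methane", 5.17, 2.23, "InChI=1S/CH4/h1H4"),
]


def select_by_category(con, category):
    """Return (formula, compound) pairs for one pollutant category."""
    cursor = con.execute(
        "SELECT formula, compound FROM pdb_trf_hodgson18 "
        "WHERE pollutant_category = ? ORDER BY formula",
        (category,),  # bound parameter; sqlite3 uses '?' as its placeholder
    )
    return cursor.fetchall()


# In-memory stand-in table with the same column names as the MySQL table.
con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE pdb_trf_hodgson18 (mm REAL, formula TEXT, compound TEXT, "
    "pollutant_category TEXT, EF_rondonia_hodgson18 REAL, "
    "EF_tocantins_cerrado_hodgson18 REAL, id TEXT)"
)
con.executemany("INSERT INTO pdb_trf_hodgson18 VALUES (?, ?, ?, ?, ?, ?, ?)", ROWS)

inorganic = select_by_category(con, "inorganic gas")
print(inorganic)
```

Placeholder syntax depends on the driver: with SQLAlchemy's `text()`, a named placeholder such as `:category` would take the place of `?`.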

In [None]:
!mysql -u root -proot -e "use primary_db; select * from pdb_p_stockwell16;"

+---------+---------+-------------------------------+---------------------+---------------------+---------------------------------------------------------------+
| mm      | formula | compound                      | pollutant_category  | EF_peat_stockwell16 | id                                                            |
+---------+---------+-------------------------------+---------------------+---------------------+---------------------------------------------------------------+
|  44.009 | CO2     | carbon dioxide                | inorganic gas       |           1564.1813 | InChI=1S/CO2/c2-1-3                                           |
|   28.01 | CO      | carbon monoxide               | inorganic gas       |            290.5111 | InChI=1S/CO/c1-2                                              |
|  16.043 | CH4     | methane                       | methane             |              9.5078 | InChI=1S/CH4/h1H4                                             |
|   2.016 | H2      | dihydr

In [None]:
!mysql -u root -proot -e "use neiva_output_db; show tables;"

+---------------------------+
| Tables_in_neiva_output_db |
+---------------------------+
| Integrated_EF             |
| Processed_EF              |
| Property_Surrogate        |
| Recommended_EF            |
+---------------------------+


In [None]:
!mysql -u root -proot -e "use neiva_output_db; select * from Integrated_EF where pollutant_category='inorganic gas';"

+-------+----------+----------------------------+--------------------+----------------------------+------------------------------------+-------------------------+-------------------------+---------------------+------------------------+-----------------------+--------------------------+----------------------+----------------------------------+----------------------------------+-----------------------+----------------------+--------------------------+---------------------+-------------------+--------------------------------+------------------------+-------------------------------------------+-------------------------------------------+-------------------------------------------+-------------------------------------+------------------------------------------+-----------------------------------+----------------------------------+----------------------------------+----------------------------------+------------------------------------+--------------------------------------------+-----------

In [None]:
!mysql -u root -proot -e "use neiva_output_db; select * from Property_Surrogate where S07='ARO2' and kOH>1e-10;"

+------+---------+---------------------------+----------------------------+------+------+-------+-------+---------+--------------------+------------------------+----------+------------------+--------------------+---------+---------------------------------+-------------------+----------+----------------------------+----------+--------------------+---------------------+---------------------------------------------------------+-------------------+
| mm   | formula | compound                  | smile                      | S07  | S07T | S18B  | S22   | MOZT1   | kOH                | kOH_ref                | ko3_exp  | kno3_exp         | vp_nannoolal       | vp      | vp_ref                          | cstar             | hc_exp   | hc_exp_ref                 | hc_est   | OCratio            | oxidation_state     | id                                                      | geos_chem_species |
+------+---------+---------------------------+----------------------------+------+------+-------+-----

## **3. Download tables**

## **3.1 Import the 'neivapy' package and other essential Python libraries**

In [None]:
import NEIVA.neivapy as nv
import pandas as pd
from sqlalchemy import text
from google.colab import files

## **3.2 Connect to the databases**

In [None]:
bk_db=nv.connect_db('backend_db')
primary_db=nv.connect_db('primary_db')
raw_db=nv.connect_db('raw_db')
legacy_db=nv.connect_db('legacy_db')
neiva_output_db=nv.connect_db('neiva_output_db')
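
How `nv.connect_db` builds its connection is not shown in this notebook. As a rough sketch only, assuming a SQLAlchemy engine over the PyMySQL driver and the `root`/`root` credentials set in section 1, an equivalent connection URL could be assembled like this. `build_mysql_url` is a hypothetical helper, not a NEIVA function.

```python
from urllib.parse import quote_plus


def build_mysql_url(db_name, user="root", password="root", host="localhost"):
    """Assemble a SQLAlchemy-style URL for the PyMySQL driver.

    Assumption: the server runs locally with the credentials from section 1.
    User and password are URL-escaped so special characters stay valid.
    """
    return f"mysql+pymysql://{quote_plus(user)}:{quote_plus(password)}@{host}/{db_name}"


url = build_mysql_url("neiva_output_db")
print(url)
```

Passing such a URL to `sqlalchemy.create_engine` gives an engine that `pd.read_sql` accepts as its `con` argument.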

## **3.3 Read tables as DataFrames and download them**

In [None]:
dd=pd.read_sql(text('select * from Recommended_EF'), con=neiva_output_db)

In [None]:
dd.to_csv('example.csv', index=False)

In [None]:
!ls

example.csv  NEIVA  sample_data


In [None]:
files.download('example.csv')

In [None]:
# Download tables from a database.
#__________ Specify the database name and connection ____
db_name='legacy_db'
db_con=legacy_db
#____________________________________________________
tbl_ll=nv.get_table_name(db_name)
tbl_ll

In [None]:
# Keep only the first two tables.
tbl_ll=tbl_ll[:2]

In [None]:
for table in tbl_ll:
    dd=pd.read_sql(text('select * from '+table), con=db_con)
    dd.to_csv(table+'.csv', index=False)
    files.download(table+'.csv')
    # Print the table name and its column names.
    print('Table name: '+table)
    print('Column names: '+', '.join(dd.columns))
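
The loop above builds each query by concatenating the table name into the SQL string. Table names cannot be sent as bound parameters, so one cautious option is to validate the name before building the query. The sketch below assumes table names contain only letters, digits, and underscores, which holds for the tables listed in this notebook; `build_select_all` is a hypothetical helper.

```python
import re

# Assumption: valid table names use only letters, digits, and underscores.
_IDENTIFIER = re.compile(r"[A-Za-z0-9_]+")


def build_select_all(table):
    """Return a 'select *' query for a validated, backtick-quoted table name."""
    if not _IDENTIFIER.fullmatch(table):
        raise ValueError(f"unexpected table name: {table!r}")
    # Backticks are MySQL's identifier quotes.
    return f"select * from `{table}`"


query = build_select_all("Recommended_EF")
print(query)
```

The returned string can be wrapped in `text(...)` and passed to `pd.read_sql` exactly as in the loop above.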