# Insert Data to MongoDB from Jupyter Notebook

How to use SQL Server Views and Stored Procedures, when creating dataframes:
- View is a virtual table based on the result-set of an SQL statement. A view contains rows and columns, just like a real table. The fields in a view are fields from one or more real tables in the database. You can add SQL statements and functions to a view and present the data as if the data were coming from one single table.
- Stored procedure is a prepared SQL code that you can save, so the code can be reused over and over again. So if you have an SQL query that you write over and over again, save it as a stored procedure, and then just call it to execute it. You can also pass parameters to a stored procedure, so that the stored procedure can act based on the parameter value(s) that is passed.

In this case we using two way to insert data to MongoDB:
1. To insert dataframe to records (or documents as it is called in MongoDB) into a collection, dataframe must first change to JSON document. This makes it easier to use the input method (dataframe rows are the corresponding collection records in the JSON file).
  - Create connection to MongoDB
  - Accessing database and collection
  - Insert the data inside a collection by using insert_many() method.
  - The parameter of the insert_many() method is a list containing dictionaries with the data you want to insert.


2. Create CSV file to importing data using the MongoDB compass (GUI). For importing the CSV file with following steps:
  - Open MongoDB compass
  - Create a new MongoDB database and collection
  - Select the created collection and then click on 'Add Data' button. Click 'Import File' on drop-down list.
  - Select file that you want to import (file type like CSV or JSON).

In [1]:
# Get necessary libraries
import pymongo
import dns
import json
import pyodbc
import sqlalchemy
from sqlalchemy.engine import URL
from sqlalchemy import create_engine
import pandas as pd
import numpy as np

#### MongoDatabase connection

In [2]:
# Establish connection to MongoDB
client = pymongo.MongoClient("mongodb://pasihintikka:pasihintikka@cluster0.bjhalyq.mongodb.net/?retryWrites=true&w=majority")

# Database and Collection
db = client["Sample_Onnettomuudet"]
col = db["vuos_onnett_paikka_osall"]

#### SQL Server connection

In [3]:
# Establish connection to SQL Server
conn = 'DRIVER={ODBC Driver 17 for SQL Server};server=DESKTOP-Q88A49I\SQLEXPRESS;database=Onnettomuudet;trusted_connection=Yes;'
connection_url = URL.create("mssql+pyodbc", query={"odbc_connect": conn})
# Create engine between python and database
engine = create_engine(connection_url)

# Database parameters
Database = 'Onnettomuudet'

#### Query for SQL Server by using db Views or Stored Procedures

In [4]:
# Database Views
#sqlcommand_v1 = 'SELECT * FROM vuos_maak_onnett;'
#sqlcommand_v2 = 'SELECT * FROM vuos_maak_onnett_tyyp;'
#sqlcommand_v3 = 'SELECT * FROM vuos_vak_onnett_olos;'
#sqlcommand_v4 = 'SELECT * FROM vuos_vak_onnett_paikka;'
#sqlcommand_v5 = 'SELECT * FROM vuos_onnett_paikka;'
sqlcommand_v6 = 'SELECT * FROM vuos_onnett_paikka_osall;'

# Database Procedures
vuosi = 2017
#sqlcommand_p1 = 'EXEC uspGetLoukk_Osall '+str(vuosi)
#sqlcommand_p2 = 'EXEC uspGetKuoll_Osall '+str(vuosi)
#sqlcommand_p3 = 'EXEC uspGetVakOnnett_Olos '+str(vuosi)
#sqlcommand_p4 = 'EXEC uspGetVakOnnett_Paikka '+str(vuosi)

#### Query for SQL Server by using unique query

In [5]:
#sqlcommand_q1 = 

#### Create selected Query and insert results to dataframe

In [6]:
query = pd.read_sql_query(sqlcommand_v6,con=engine)
df = pd.DataFrame(query)

In [7]:
df

Unnamed: 0,Onnett_id,Vuosi,Kuukausi,Viikonpäivä,Tunti,Vakavuus,Loukkaantuneet,Kuolleet,Osallisen_laji,Kuollut,...,Valoisuus,Sää,Lämpötila,Maakunta,Maak_Loc,Väestö,Kunta,Katuosoite,position.lat,position.lon
0,6422405,2009,5,Lauantai,12.0,Ei henkilövahinkoja,0,0,,,...,päivänvalo,kirkas,,Etelä-Pohjanmaa,FI-03,198502,Vimpeli,SÄNTINTIE 12,,
1,6422406,2011,4,Torstai,15.0,Ei henkilövahinkoja,0,0,henkilöauto,0.0,...,päivänvalo,vesisade,,Keski-Suomi,FI-08,271083,Kyyjärvi,KIRKKOTIE 2,62.491272,22.757845
2,6422407,2010,8,Torstai,2.0,Ei henkilövahinkoja,0,0,henkilöauto,0.0,...,tie valaistu,Ei arvoa,,Lappi,FI-10,183748,Kemi,VAPAUDENTIE X KAUPPATIE,62.788365,22.834446
3,6422408,2010,11,Perjantai,8.0,Ei henkilövahinkoja,0,0,henkilöauto,0.0,...,Ei arvoa,Ei arvoa,,Lappi,FI-10,183748,Kemi,PENTTILÄNTIE,62.794009,22.892707
4,6422409,2011,12,Lauantai,14.0,Ei henkilövahinkoja,0,0,,,...,Ei arvoa,Ei arvoa,,Keski-Suomi,FI-08,271083,Kyyjärvi,PATRUUNANTIE 10,63.159636,23.826183
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
629816,10211231,2021,11,Tiistai,11.0,Ei henkilövahinkoja,0,0,kuorma-auto,0.0,...,päivänvalo,kirkas,-15.0,Päijät-Häme,FI-16,205771,Lahti,VÄSTÄRÄKINKATU 14,63.098776,21.687995
629817,10211232,2021,11,Tiistai,16.0,Loukkaantumiseen johtanut,5,0,henkilöauto,0.0,...,pimeä (valaisematon),pilvipouta,-5.0,Päijät-Häme,FI-16,205771,Lahti,PALOSAARENTIE,63.114629,21.608269
629818,10211232,2021,11,Tiistai,16.0,Loukkaantumiseen johtanut,5,0,henkilöauto,0.0,...,pimeä (valaisematon),pilvipouta,-5.0,Päijät-Häme,FI-16,205771,Lahti,PALOSAARENTIE,63.114629,21.608269
629819,10211233,2021,11,Tiistai,17.0,Ei henkilövahinkoja,0,0,henkilöauto,0.0,...,tie valaistu,lumisade,-4.0,Päijät-Häme,FI-16,205771,Lahti,VÄSTERLEDEN X LAPPFJÄRDSVÄGEN,63.663510,22.689106


#### Create records from dataframe and insert records to Mongo database

In [10]:
#data = df.to_dict(orient='records')

In [11]:
#col.insert_many(data)

In [21]:
records = json.loads(df.T.to_json()).values()

In [22]:
# IMPORTANT!
# Change the Collection name 'db.<Collection_name>.insert_many(records)'

db.vuos_onnett_paikka_osall.insert_many(records)

ServerSelectionTimeoutError: cluster0.bjhalyq.mongodb.net:27017: [Errno 11001] getaddrinfo failed, Timeout: 30s, Topology Description: <TopologyDescription id: 635a75c46cd6d162ed1fce06, topology_type: Unknown, servers: [<ServerDescription ('cluster0.bjhalyq.mongodb.net', 27017) server_type: Unknown, rtt: None, error=AutoReconnect('cluster0.bjhalyq.mongodb.net:27017: [Errno 11001] getaddrinfo failed')>]>

#### Create records into csv, for inserting by MongoDB Compass

In [8]:
# Create result csv file and save it
df.to_csv('../datasets/onnettomuus/db/vuos_onnett_paikka_osall.csv', sep=';', encoding='utf-8', index=False)