## Load Customer Data to Mongo Collection

Develop a function to load customer data from a file into a MongoDB collection.
* Let us go through the details of the actual use case.
  * An eCommerce platform is typically a mobile app or a web app used to sell products to customers. Its data usually lives in an RDBMS database such as Postgres, MySQL, Oracle, or SQL Server.
  * However, the eCommerce company that runs the platform wants to run a loyalty program. For that, they build an application using a MongoDB database.
  * We now need to integrate the customer data from the RDBMS database into the MongoDB database built for the loyalty program application.
  * We can integrate data between the eCommerce platform and the loyalty program either in batch mode or in real time.
  * In batch mode, the data from the eCommerce platform database is sent once every day in the form of files, and a scheduled job loads this data into the target MongoDB database built for the loyalty program.
* Let us assume that the previous day's data is sent every day to the location **/data/ecomm/customers**.
* The file naming convention is **part-00000**. The number in the file name is incremented for each new file added with incremental data.
* For now, customer data is available at **/data/ecomm/customers/part-00000**.
* Make sure to understand the characteristics of the data.
* Use the appropriate Pandas API to read the data.
* Use MongoDB bulk insert to insert the data into the Mongo collection.
* Use this piece of code to determine the **database name** and **user name** you are supposed to use.

In [None]:
import getpass
username = getpass.getuser()

db_name = f'{username}_scratch_db'
db_user = f'{username}_scratch_user'
db_host = 'pylabsmd.itversity.com'
db_port = 27017

* Use this piece of code to get the password from the **/home/itversity/.jupyterenv** file.

In [None]:
import configparser

config = configparser.ConfigParser()
config.read('/home/itversity/.jupyterenv')

db_pass = config['DEFAULT']['MONGO_SCRATCH_PASS']

* Develop multiple functions that read the data from **/data/ecomm/customers/part-00000** and then write it to a collection named **customers** in the database.
  * Step 1: Create Mongo Client
  * Step 2: Read the data from file to list of dicts
  * Step 3: Write list of dicts to Mongo Collection
* You need to modularize the code.

### Step 1: Create Mongo Client

Develop a function named **get_mongo_client** that takes all the arguments needed to build a client and returns the client object.
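
As a hint on how the authentication source fits into a connection, here is a minimal sketch that only assembles a MongoDB connection string; it does not connect to anything. The helper name `build_mongo_uri` and the sample credentials are hypothetical and not part of the exercise. Passing keyword arguments such as `username`, `password`, and `authSource` to `pymongo.MongoClient` is an equally valid route.

```python
from urllib.parse import quote_plus


def build_mongo_uri(db_host, db_port, db_user, db_pass, db_auth_source='admin'):
    """Hypothetical helper: assemble a MongoDB connection string."""
    # Percent-encode the credentials so characters such as '@' or ':'
    # do not break the structure of the URI.
    user = quote_plus(db_user)
    password = quote_plus(db_pass)
    return f'mongodb://{user}:{password}@{db_host}:{db_port}/?authSource={db_auth_source}'


# Made-up values, used only to show the shape of the result.
uri = build_mongo_uri('localhost', 27017, 'demo_user', 'p@ss:word', 'admin')
print(uri)
```

A string of this shape can be passed as the first argument to `pymongo.MongoClient`. The client connects lazily, so authentication problems usually surface on the first real operation rather than when the client is created.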

In [None]:
# Your code should go here
# The function should return the mongo client object.

import pymongo
def get_mongo_client(db_host, db_port, db_user, db_pass, db_auth_source='admin'):
    # Make sure to pass the authSource argument while creating the mongo client

    return client

* Validate **get_mongo_client** using this piece of code. You might or might not see the collections in the database.

In [None]:
client = get_mongo_client(
    db_host=db_host,
    db_port=db_port,
    db_user=db_user,
    db_pass=db_pass,
    db_auth_source='admin'
)

client[f'{db_name}'].list_collection_names()

### Step 2: Read the data from file to list of dicts

Develop a function that reads the data from the file at the given path and returns a list of dicts.
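
The sketch below shows one way a Pandas DataFrame can be turned into a list of dicts. It assumes the data is in JSON-lines format (one JSON document per line), which you should verify against the actual file before relying on it. The two sample records and the field names `id` and `first_name` are made up for illustration.

```python
import io

import pandas as pd

# Two made-up records standing in for the real file.
# The JSON-lines layout is an assumption, not a confirmed fact about the data.
sample = io.StringIO(
    '{"id": 1, "first_name": "Ann", "product_subscription": "yearly"}\n'
    '{"id": 2, "first_name": "Bob", "product_subscription": "monthly"}\n'
)

# lines=True tells Pandas to parse one JSON document per line.
df = pd.read_json(sample, lines=True)

# orient='records' yields one dict per row, keyed by column name.
records = df.to_dict(orient='records')
print(records)
```

If the file turns out to be delimited text instead, a reader such as `pd.read_csv` would replace `pd.read_json`, and the `to_dict(orient='records')` step stays the same.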

In [None]:
# Your code should go here
# Make sure to return the list of dicts

import pandas as pd
def read_file_into_collection(file_path):

    return data_list

* Validate **read_file_into_collection** using this piece of code.
* The file contains 20 records, and you should be able to see a couple of records in the output.

In [None]:
data_list = read_file_into_collection('/data/ecomm/customers/part-00000')

In [None]:
len(data_list)

In [None]:
data_list[:2]

### Step 3: Write list of dicts to Mongo Collection

Develop a function to write the list of dicts into the Mongo collection named **customers** in chunks.
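
To illustrate what "in chunks" means here, the following sketch splits a list into consecutive slices of a fixed size using plain Python. The helper name `iter_chunks` and the placeholder documents are hypothetical. The actual database write is left out on purpose.

```python
def iter_chunks(items, chunk_size):
    """Hypothetical helper: yield consecutive slices of at most chunk_size items."""
    for start in range(0, len(items), chunk_size):
        yield items[start:start + chunk_size]


# 20 placeholder documents, matching the record count mentioned in this exercise.
docs = [{'id': i} for i in range(1, 21)]

chunk_lengths = [len(chunk) for chunk in iter_chunks(docs, 6)]
print(chunk_lengths)  # [6, 6, 6, 2]
```

With the default `chunk_size=6`, 20 records would go in as four inserts, the last one holding the remaining 2 records. Each slice could be handed to the collection's `insert_many` method.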

In [None]:
# Your code should go here
# Each database insert should populate the number of records passed as chunk_size
def write_to_customers(data_list, collection_name, mongo_client, chunk_size=6):

    return

* Validate **write_to_customers** using this piece of code.

In [None]:
client[f'{db_name}'].list_collection_names()

In [None]:
cust_collection = client[f'{db_name}'].get_collection('customers')

In [None]:
# Deleting all the documents from customers collection
cust_collection.delete_many({})

In [None]:
# This should return 0
cust_collection.count_documents({})

In [None]:
write_to_customers(data_list, 'customers', client)

In [None]:
cust_collection = client[f'{db_name}'].get_collection('customers')

In [None]:
# This should return 20
cust_collection.count_documents({})

In [None]:
# This should return 1 record
cust_collection.find_one()

In [None]:
# This should print the details of all customers with a yearly subscription. We have 8 of them
for customer in cust_collection.find({'product_subscription': 'yearly'}):
    print(customer)