* Getting started with faker
* Set up MongoDB database and collection
* Overview of MongoDB CRUD operations
* Create index on month and day of month
* Generate test data using faker
* Getting started with pymongo
* Populate generated data into MongoDB collection
* Validate data in MongoDB collection
* Exercise and solution

In [1]:
# Getting started with faker
# python -m pip install faker
from faker import Faker

In [2]:
faker = Faker()

In [3]:
faker.name()

'Laura Gibbs'

In [4]:
faker.first_name()

'Jonathan'

In [5]:
faker.last_name()

'Key'

In [None]:
# Set up MongoDB database and collection
# Install MongoDB Community Edition
# Launch the Mongo shell using mongosh, then create the database and collection
# mongosh
# use mailer
# db.createCollection('mails')

In [None]:
# Overview of MongoDB CRUD operations (run these in mongosh)
# db.mails.insertOne({"fn": "Durga", "ln": "Gadiraju", "m": "d@g.com"})
# db.mails.deleteOne({'m': 'd@g.com'})
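The shell commands above have direct counterparts in pymongo. The following is a minimal sketch, assuming `coll` is a pymongo collection such as `client['mailer']['mails']`. The helper name `crud_round_trip` and the updated email value are made up for illustration.

```python
def crud_round_trip(coll):
    """Run one insert, read, update, and delete against a pymongo collection.

    `coll` is assumed to be a pymongo Collection, for example
    pymongo.MongoClient('localhost', 27017)['mailer']['mails'].
    """
    doc = {'fn': 'Durga', 'ln': 'Gadiraju', 'm': 'd@g.com'}

    # Create: counterpart of db.mails.insertOne(...)
    coll.insert_one(doc)

    # Read: counterpart of db.mails.findOne(...)
    found = coll.find_one({'m': 'd@g.com'})

    # Update: counterpart of db.mails.updateOne(...); the new value is illustrative
    coll.update_one({'m': 'd@g.com'}, {'$set': {'m': 'durga@g.com'}})

    # Delete: counterpart of db.mails.deleteOne(...)
    deleted = coll.delete_one({'m': 'durga@g.com'})

    return found, deleted.deleted_count
```

Against a running server, `crud_round_trip(coll)` should leave the collection as it found it, since the inserted document is deleted at the end.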

In [None]:
# Create index on month and day of month
# db.mails.createIndex({"month": 1, "day": 1}, {"name": "mails-day-1"})
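The same compound index can also be created from Python instead of the shell. This is a sketch only, assuming `coll` is the pymongo collection for `mails`. The function name `create_month_day_index` is hypothetical.

```python
def create_month_day_index(coll):
    """Create the compound index on month and day and return its name."""
    # 1 means ascending order, the same value as pymongo.ASCENDING
    keys = [('month', 1), ('day', 1)]
    return coll.create_index(keys, name='mails-day-1')
```

Calling it as `create_month_day_index(client['mailer']['mails'])` should return the index name, `'mails-day-1'`.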

In [7]:
# Generate test data using faker
rec_count = 10

In [8]:
events = ['Birth Day', 'Marriage Anniversary']

In [9]:
import random
random.choice(events)

'Birth Day'
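`random.choice` returns a different sequence on every run, which makes generated test data hard to compare between runs. If repeatable data is wanted, one option is a seeded generator. The sketch below uses only the standard library, and the helper name `pick_events` is made up for illustration. Faker offers a comparable `Faker.seed()` class method for its own generators.

```python
import random

events = ['Birth Day', 'Marriage Anniversary']

def pick_events(count, seed):
    """Return `count` randomly chosen events, repeatable for a given seed."""
    rng = random.Random(seed)  # private generator; the global one is untouched
    return [rng.choice(events) for _ in range(count)]

first = pick_events(5, seed=42)
second = pick_events(5, seed=42)
print(first == second)  # True: the same seed gives the same choices
```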

In [10]:
recs = []

for i in range(1, rec_count + 1):
    fn = faker.first_name()
    ln = faker.last_name()
    m = f'dgadiraju+{fn.lower()}@gmail.com'
    month = faker.month()
    day = faker.day_of_month()
    e = random.choice(events)
    recs.append((i, fn, ln, m, month, day, e))

In [11]:
recs

[(1,
  'Andrea',
  'Jackson',
  'dgadiraju+andrea@gmail.com',
  '01',
  '08',
  'Birth Day'),
 (2,
  'Michael',
  'Armstrong',
  'dgadiraju+michael@gmail.com',
  '12',
  '30',
  'Marriage Anniversary'),
 (3,
  'Stacey',
  'Wallace',
  'dgadiraju+stacey@gmail.com',
  '03',
  '27',
  'Birth Day'),
 (4,
  'Brian',
  'Delgado',
  'dgadiraju+brian@gmail.com',
  '06',
  '19',
  'Marriage Anniversary'),
 (5,
  'Michael',
  'Williams',
  'dgadiraju+michael@gmail.com',
  '09',
  '22',
  'Marriage Anniversary'),
 (6,
  'Hayden',
  'Hansen',
  'dgadiraju+hayden@gmail.com',
  '03',
  '22',
  'Marriage Anniversary'),
 (7,
  'Juan',
  'Ferrell',
  'dgadiraju+juan@gmail.com',
  '02',
  '05',
  'Marriage Anniversary'),
 (8, 'Daniel', 'Long', 'dgadiraju+daniel@gmail.com', '09', '21', 'Birth Day'),
 (9, 'Evan', 'Madden', 'dgadiraju+evan@gmail.com', '03', '10', 'Birth Day'),
 (10,
  'David',
  'Lee',
  'dgadiraju+david@gmail.com',
  '02',
  '22',
  'Marriage Anniversary')]
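In the output above, records 2 and 5 share the address `dgadiraju+michael@gmail.com` because the address is built from the first name alone. The later version of this loop appends the record number `i` to avoid that. A small check like the following can flag such collisions. It is a sketch: the helper name `duplicate_emails` is hypothetical, and it assumes the email sits at position 3 of each tuple, as in `recs`.

```python
from collections import Counter

def duplicate_emails(recs, email_pos=3):
    """Return the sorted email addresses that occur more than once in recs."""
    counts = Counter(rec[email_pos] for rec in recs)
    return sorted(email for email, n in counts.items() if n > 1)

# Three of the records shown above
sample = [
    (2, 'Michael', 'Armstrong', 'dgadiraju+michael@gmail.com', '12', '30', 'Marriage Anniversary'),
    (5, 'Michael', 'Williams', 'dgadiraju+michael@gmail.com', '09', '22', 'Marriage Anniversary'),
    (8, 'Daniel', 'Long', 'dgadiraju+daniel@gmail.com', '09', '21', 'Birth Day'),
]
print(duplicate_emails(sample))  # ['dgadiraju+michael@gmail.com']
```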

In [12]:
# Getting started with pymongo
# python -m pip install pymongo
import pymongo

In [13]:
client = pymongo.MongoClient('localhost', 27017)

In [14]:
db = client['mailer']

In [15]:
coll = db['mails']

In [17]:
coll.delete_many({})

<pymongo.results.DeleteResult at 0x1113ae680>

In [18]:
# The collection is empty after delete_many, so find_one() returns None (no output is shown)
coll.find_one()

In [19]:
import random
from faker import Faker

faker = Faker()
rec_count = 10
events = ['Birth Day', 'Marriage Anniversary']
recs = []

for i in range(1, rec_count + 1):
    fn = faker.first_name()
    ln = faker.last_name()
    m = f'dgadiraju+{fn.lower()}{i}@gmail.com'
    month = faker.month()
    day = faker.day_of_month()
    e = random.choice(events)
    recs.append({
        'fn': fn, 
        'ln': ln, 
        'm': m, 
        'month': month, 
        'day': day, 
        'e': e
    })

In [20]:
# Populate generated data into MongoDB collection
import pymongo
client = pymongo.MongoClient('localhost', 27017)
db = client['mailer']
coll = db['mails']

In [21]:
coll.insert_many(recs)

<pymongo.results.InsertManyResult at 0x1113ac4f0>

In [22]:
# Validate data in MongoDB collection
for rec in coll.find():
    print(rec)

{'_id': ObjectId('645f55a6cd531e17d1d7b601'), 'fn': 'James', 'ln': 'Schultz', 'm': 'dgadiraju+james1@gmail.com', 'month': '08', 'day': '04', 'e': 'Birth Day'}
{'_id': ObjectId('645f55a6cd531e17d1d7b602'), 'fn': 'Adam', 'ln': 'Smith', 'm': 'dgadiraju+adam2@gmail.com', 'month': '07', 'day': '12', 'e': 'Birth Day'}
{'_id': ObjectId('645f55a6cd531e17d1d7b603'), 'fn': 'Robin', 'ln': 'Hubbard', 'm': 'dgadiraju+robin3@gmail.com', 'month': '09', 'day': '15', 'e': 'Birth Day'}
{'_id': ObjectId('645f55a6cd531e17d1d7b604'), 'fn': 'Melissa', 'ln': 'Gonzalez', 'm': 'dgadiraju+melissa4@gmail.com', 'month': '01', 'day': '09', 'e': 'Marriage Anniversary'}
{'_id': ObjectId('645f55a6cd531e17d1d7b605'), 'fn': 'Amanda', 'ln': 'Terrell', 'm': 'dgadiraju+amanda5@gmail.com', 'month': '05', 'day': '10', 'e': 'Marriage Anniversary'}
{'_id': ObjectId('645f55a6cd531e17d1d7b606'), 'fn': 'Sherry', 'ln': 'Perez', 'm': 'dgadiraju+sherry6@gmail.com', 'month': '01', 'day': '25', 'e': 'Marriage Anniversary'}
{'_id': Ob
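Printing every document works for 10 records, but a count check plus a filtered query scales better and exercises the `month`/`day` fields that the index covers. A minimal sketch, assuming `coll` is the pymongo `mails` collection. The function name `validate_load` is hypothetical.

```python
def validate_load(coll, expected_count, month, day):
    """Check the document count, then return the records for one month and day."""
    actual = coll.count_documents({})
    if actual != expected_count:
        raise ValueError(f'expected {expected_count} documents, found {actual}')
    # month and day are zero-padded strings, as generated by faker above
    return list(coll.find({'month': month, 'day': day}))
```

For the data shown above, `validate_load(coll, 10, '08', '04')` would be expected to return the record for James Schultz.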

* Exercise: Load the sales data from `data/sales/part-00000` into a MongoDB collection named `sales` in the database `sales_db`.
  * Create a database named `sales_db`.
  * Create a collection named `sales`.
  * Populate the collection with the data from the CSV file `data/sales/part-00000`.
  * Validate by running `find` on the collection. It should display all 10 records.
  * Hint: Use Pandas to create a DataFrame, then call `to_dict` on the DataFrame to convert the data into a list of dicts, and insert that list into the MongoDB collection using `insert_many`.

In [None]:
import pandas as pd

sales = pd.read_csv('data/sales/part-00000')

In [None]:
sales.to_dict(orient='records')

In [None]:
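import pymongo

# Connect to the local MongoDB server; the database and collection are created on first insert
client = pymongo.MongoClient('localhost', 27017)
sales_db = client['sales_db']
sales_coll = sales_db['sales']

# Convert each DataFrame row to a dict and insert all rows in one call
sales_coll.insert_many(sales.to_dict(orient='records'))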

In [None]:
for rec in sales_coll.find({}):
    print(rec)
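The solution can also report how many documents were written, which gives a quick check against the expected 10 records without printing them all. This is a sketch under assumptions: `coll` is a pymongo collection, `records` is the list of dicts from `to_dict(orient='records')`, and the helper name `load_records` is made up for illustration.

```python
def load_records(coll, records):
    """Insert a list of dicts and return how many documents were written."""
    if not records:
        return 0  # nothing to insert, so skip the call for an empty list
    result = coll.insert_many(records)
    # insert_many returns a result object whose inserted_ids lists the new _id values
    return len(result.inserted_ids)
```

A call such as `load_records(sales_coll, sales.to_dict(orient='records'))` would then be expected to return 10 for this exercise.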