## 3. The PyMongo Module in Python

Let’s explore how to connect MongoDB data to Python using PyMongo.

To install the module, run the following command in your conda terminal:
**pip install pymongo**

In [1]:
# Import the library
import pymongo

In [2]:
# Create a client for the local MongoDB server; it gives access to every database on it
databases = pymongo.MongoClient()

In [3]:
# Access the `admin` database from among the databases on the server
admin_db = databases.admin

In [5]:
# Access the `DataAnalysisVidhya` collection that we just created inside the `admin` database
tutorial_collection = admin_db.DataAnalysisVidhya

In [6]:
# This is where our imported `iris` data is stored.
# To fetch a single document (record) from the collection, we can write:
tutorial_collection.find_one({})

{'_id': ObjectId('639962e1962d29d48eb20649'),
 '': 1,
 'Sepal': {'Length': '5.1', 'Width': '3.5'},
 'Petal': {'Length': '1.4', 'Width': '0.2'},
 'Species': 'setosa'}

In [7]:
tutorial_collection.find({})

<pymongo.cursor.Cursor at 0x20645b5c910>

**Note: `find()` returns a PyMongo cursor object rather than the documents themselves. The cursor is iterable, so we can convert it into a list to glance at all the values.**

In [8]:
list(tutorial_collection.find({}))

[{'_id': ObjectId('639962e1962d29d48eb20649'),
  '': 1,
  'Sepal': {'Length': '5.1', 'Width': '3.5'},
  'Petal': {'Length': '1.4', 'Width': '0.2'},
  'Species': 'setosa'},
 {'_id': ObjectId('639962e1962d29d48eb2064a'),
  '': 2,
  'Sepal': {'Length': '4.9', 'Width': '3'},
  'Petal': {'Length': '1.4', 'Width': '0.2'},
  'Species': 'setosa'},
 {'_id': ObjectId('639962e1962d29d48eb2064b'),
  '': 3,
  'Sepal': {'Length': '4.7', 'Width': '3.2'},
  'Petal': {'Length': '1.3', 'Width': '0.2'},
  'Species': 'setosa'},
 {'_id': ObjectId('639962e1962d29d48eb2064c'),
  '': 4,
  'Sepal': {'Length': '4.6', 'Width': '3.1'},
  'Petal': {'Length': '1.5', 'Width': '0.2'},
  'Species': 'setosa'},
 {'_id': ObjectId('639962e1962d29d48eb2064d'),
  '': 5,
  'Sepal': {'Length': '5', 'Width': '3.6'},
  'Petal': {'Length': '1.4', 'Width': '0.2'},
  'Species': 'setosa'},
 {'_id': ObjectId('639962e1962d29d48eb2064e'),
  '': 6,
  'Sepal': {'Length': '5.4', 'Width': '3.9'},
  'Petal': {'Length': '1.7', 'Width': '0.4

The list continues through all 150 documents of the iris dataset.
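
Because the cursor is iterable, you do not have to build the full list just to look at a few documents. The sketch below uses a generator as a stand-in for the cursor so that it runs without a database. The helper name `peek` is a hypothetical choice for illustration.

```python
from itertools import islice


def peek(documents, n=3):
    """Return the first `n` items from any iterable, such as a PyMongo cursor."""
    return list(islice(documents, n))


# Stand-in for tutorial_collection.find({}): a generator yielding 150 documents one by one
fake_cursor = ({"": i, "Species": "setosa"} for i in range(1, 151))

first_three = peek(fake_cursor)
print([doc[""] for doc in first_three])  # prints [1, 2, 3]
```

With a live collection, `peek(tutorial_collection.find({}))` should behave the same way. PyMongo cursors also provide a `limit()` method, which caps the number of documents on the server side.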

## 4. Getting Ready for Data Science Analysis

We are now at the final stage, which connects this tutorial to downstream data science and analytics tasks.

We need to create a pandas DataFrame from our MongoDB tutorial collection.

### Jupyter notebooks for better interactivity.

In [9]:
import pandas as pd

iris_df = pd.DataFrame(list(tutorial_collection.find({})))
iris_df

| | _id | | Sepal | Petal | Species |
|---|---|---|---|---|---|
| 0 | 639962e1962d29d48eb20649 | 1 | {'Length': '5.1', 'Width': '3.5'} | {'Length': '1.4', 'Width': '0.2'} | setosa |
| 1 | 639962e1962d29d48eb2064a | 2 | {'Length': '4.9', 'Width': '3'} | {'Length': '1.4', 'Width': '0.2'} | setosa |
| 2 | 639962e1962d29d48eb2064b | 3 | {'Length': '4.7', 'Width': '3.2'} | {'Length': '1.3', 'Width': '0.2'} | setosa |
| 3 | 639962e1962d29d48eb2064c | 4 | {'Length': '4.6', 'Width': '3.1'} | {'Length': '1.5', 'Width': '0.2'} | setosa |
| 4 | 639962e1962d29d48eb2064d | 5 | {'Length': '5', 'Width': '3.6'} | {'Length': '1.4', 'Width': '0.2'} | setosa |
| ... | ... | ... | ... | ... | ... |
| 145 | 639962e1962d29d48eb206da | 146 | {'Length': '6.7', 'Width': '3'} | {'Length': '5.2', 'Width': '2.3'} | virginica |
| 146 | 639962e1962d29d48eb206db | 147 | {'Length': '6.3', 'Width': '2.5'} | {'Length': '5', 'Width': '1.9'} | virginica |
| 147 | 639962e1962d29d48eb206dc | 148 | {'Length': '6.5', 'Width': '3'} | {'Length': '5.2', 'Width': '2'} | virginica |
| 148 | 639962e1962d29d48eb206dd | 149 | {'Length': '6.2', 'Width': '3.4'} | {'Length': '5.4', 'Width': '2.3'} | virginica |
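
The `Sepal` and `Petal` columns still hold nested dictionaries, and every measurement is a string, which gets in the way of numeric analysis. One possible way to handle this is sketched below using `pd.json_normalize` and `pd.to_numeric`. The two sample documents are copied from the output earlier in this section, with `_id` and the unnamed counter field left out for brevity. The names `docs`, `flat_df`, and `measure_cols` are illustrative.

```python
import pandas as pd

# Two documents taken from the find() output (without `_id` and the unnamed counter)
docs = [
    {"Sepal": {"Length": "5.1", "Width": "3.5"},
     "Petal": {"Length": "1.4", "Width": "0.2"},
     "Species": "setosa"},
    {"Sepal": {"Length": "4.9", "Width": "3"},
     "Petal": {"Length": "1.4", "Width": "0.2"},
     "Species": "setosa"},
]

# json_normalize expands nested dictionaries into columns such as "Sepal_Length"
flat_df = pd.json_normalize(docs, sep="_")

# The measurements were imported as strings, so convert them to numbers
measure_cols = ["Sepal_Length", "Sepal_Width", "Petal_Length", "Petal_Width"]
flat_df[measure_cols] = flat_df[measure_cols].apply(pd.to_numeric)

print(flat_df.dtypes)
```

With the live collection, the same idea would start from `pd.json_normalize(list(tutorial_collection.find({})), sep="_")`.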


**If you don’t want some of the columns, you can remove them in two ways:**

* The first is to exclude them before the data reaches your Python code, using MongoDB aggregation pipelines (out of scope for now).

* The second is to clean the data after creating the DataFrame.
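
For the first approach, a full aggregation pipeline is not strictly required when all you want is to leave out a field. A lighter option is the projection argument of `find()`. The sketch below is one possible way to do it, not the method used in this tutorial, and the function name `load_without_id` is hypothetical.

```python
import pandas as pd


def load_without_id(collection):
    """Build a DataFrame from every document, asking MongoDB to omit `_id`."""
    # The second argument to find() is a projection; {"_id": 0} excludes that field
    cursor = collection.find({}, {"_id": 0})
    return pd.DataFrame(list(cursor))


# Assumed usage with the collection from this tutorial:
# iris_df = load_without_id(tutorial_collection)
```

The filtering happens on the server, so the `_id` values are never sent to Python at all.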

In [2]:
# We will drop the `_id` column using the second approach
iris_df = iris_df.drop("_id", axis=1)
# iris_df.head()
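
In a notebook, re-running the cell above raises a `KeyError`, because `_id` has already been removed from `iris_df`. A more forgiving variant is sketched below on a small stand-in DataFrame. It also drops the unnamed counter column, on the assumption that you do not need it. The names `sample_df` and `cleaned_df` are illustrative.

```python
import pandas as pd

# Small stand-in for iris_df; the `_id` strings are placeholders for real ObjectId values
sample_df = pd.DataFrame({
    "_id": ["id-1", "id-2"],
    "": [1, 2],
    "Species": ["setosa", "setosa"],
})

# columns= names the labels to drop; errors="ignore" skips any label that is already gone
cleaned_df = sample_df.drop(columns=["_id", ""], errors="ignore")
cleaned_again = cleaned_df.drop(columns=["_id", ""], errors="ignore")  # safe to repeat

print(list(cleaned_again.columns))  # prints ['Species']
```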

**From here on, you can write the same code as for any other data science or analytics task.**

From this point onwards, you can be as flexible as you want with your data science skills.