# Section 1: Connect to MongoDB & explore your data

<div class="alert alert-block alert-info">
   
## Jupyter Notebook basics

- **Code cells** Cells shaded grey are code cells. As you work through the lab, run all code cells in order.
- **Running code** To run code, press Shift + Enter or click the 'Run' button on the menu bar. Where there is code already in a cell, run it as written. Where a code cell contains the comment `#Write your code here`, write code to complete the task & then run it. If needed, consult the hints & answer to enter and run the correct entry for a task before moving on to the next task. Not every command will result in visible output.
- **Markdown cells** The non-code cells are written in the Markdown markup language. Double-clicking a Markdown cell will cause it to appear in raw Markdown format. To render as text again, run the cell just like running a code cell: press Shift + Enter or click the 'Run' button on the menu bar.  
- **Restarting the kernel** If the notebook becomes unresponsive, or if either the notebook or your code displays unexpected behavior, reset the notebook by choosing "Kernel -> Restart & Clear Output" from the menu bar. This will clear all memory objects in the notebook, stop any code running, and reset the notebook to its initial state. 
- **Session timeout** Sessions will automatically shut down after about 10 minutes of inactivity. (If you leave a lab window open in the foreground, this will generally be counted as “activity”.) See Binder docs: [How long will my Binder session last?](https://mybinder.readthedocs.io/en/latest/about/about.html?highlight=session%20last#how-long-will-my-binder-session-last)

</div>


## Introduction

In this lab you'll use PyMongo, the official Python driver for MongoDB, to connect to and work with a MongoDB database containing data on movies, users and comments on a hypothetical movie review website. 

In this first section you'll connect to the database and begin exploring your data. Specifically, you'll:

- Connect to a MongoDB instance using the `pymongo` Python driver
- Print a list of all the databases on the MongoDB instance
- Get a count of the number of documents in a specific collection
- Print out a single document from the collection 

## Setup 

This environment has MongoDB installed, our starter data loaded, and the `mongod` daemon process running on localhost at the default port. The PyMongo driver is also installed. 

(When you work on your own projects, you'll need to make sure you have a MongoDB instance set up and running - either on the MongoDB Atlas cloud database platform or locally installed - and the PyMongo driver installed.) 

## Tasks

### 1. Connect to the MongoDB instance
Before you can start querying the data on your MongoDB instance, you need to connect to it.

In [None]:
# Replace the blanks with the missing code
from pymongo import ____
client = ____

#### <span style="color:blue">Hints</span>
- First import `MongoClient` from `pymongo`.
- Then create a `MongoClient` instance.
- Related docs: [Making a Connection with MongoClient](https://pymongo.readthedocs.io/en/stable/tutorial.html#making-a-connection-with-mongoclient)

#### <span style="color:green">Answer</span>
```python
from pymongo import MongoClient
client = MongoClient()
```
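
Called with no arguments, `MongoClient()` connects to `localhost` on port `27017`, which is why the answer above works in this environment. For other setups you would typically pass a connection string instead. The following is a minimal sketch of how such strings might be built; the helper names `local_uri` and `atlas_uri` and the example values are hypothetical, not part of PyMongo.

```python
from urllib.parse import quote_plus


def local_uri(host="localhost", port=27017):
    """Build a connection string for a self-hosted MongoDB instance."""
    return f"mongodb://{host}:{port}/"


def atlas_uri(username, password, cluster_host):
    """Build an SRV-style connection string (the form Atlas uses).

    Credentials are percent-escaped so that characters such as '@' or '/'
    do not break the URI.
    """
    return (
        f"mongodb+srv://{quote_plus(username)}:{quote_plus(password)}"
        f"@{cluster_host}/"
    )


uri = local_uri()
print(uri)
# With a server running, the string is passed to the client constructor:
#     client = MongoClient(uri)
```

In this lab the no-argument form is sufficient; the explicit string only becomes necessary when the server is not on the local default port.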

To confirm that you're connected to a MongoDB server, and to get information about the server you're connected to, run `client.server_info()`.

In [None]:
client.server_info()
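
The dictionary returned by `server_info()` is fairly long. As a rough sketch, a small helper can pull out just a couple of fields; `server_summary` is a hypothetical name, and the chosen keys (`version`, `maxBsonObjectSize`) are assumed to be present in the server's build information, so the code falls back to `None` if they are not.

```python
def server_summary(client):
    """Return a few fields from client.server_info() as a small dict.

    `client` is expected to be a connected MongoClient, such as the one
    created in Task 1.
    """
    info = client.server_info()
    wanted = ("version", "maxBsonObjectSize")
    # .get() keeps this safe if a key is missing on a given server build.
    return {key: info.get(key) for key in wanted}


# In the lab, with `client` from Task 1:
#     print(server_summary(client))
```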

### 2. Print a list of all the databases on this MongoDB instance
A single instance of MongoDB can support multiple independent databases. When you're connecting to a MongoDB instance for the first time, it can be helpful to check which databases are on the instance. 

Before starting on the task below, run the following cell. It imports the `pprint` function from Python's `pprint` module, which you'll use to print the output in a more readable format. 

In [None]:
# Import the pprint function from Python's standard-library pprint module
from pprint import pprint

In [None]:
# Write your code here 

#### <span style="color:blue">Hints</span>
- Use the `MongoClient.list_database_names()` method ([docs](https://pymongo.readthedocs.io/en/stable/api/pymongo/mongo_client.html?highlight=list_database_names#pymongo.mongo_client.MongoClient.list_database_names))
- Use a for loop to iterate through the database names.
- Use `pprint` to print the output in a more readable format.

#### <span style="color:green">Answer</span>
```python
for db in client.list_database_names():
    pprint(db)
```
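
Once you know the database names, a natural next step is to see which collections each one holds. The sketch below assumes a connected client like the one from Task 1 and uses PyMongo's `list_collection_names()` on each database; the helper name `collections_by_database` is hypothetical.

```python
def collections_by_database(client):
    """Map each database name on the instance to its collection names."""
    return {
        db_name: client[db_name].list_collection_names()
        for db_name in client.list_database_names()
    }


# In the lab, with `client` from Task 1 and `pprint` imported above:
#     pprint(collections_by_database(client))
```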

### 3. Find the number of documents in the `movies` collection of the `sample_mflix` database
It's helpful to get a general understanding of the data a collection contains before starting to query it. 

Because the data on each movie is contained in a single document, finding the number of documents in the collection tells you how many movies are included in the database.

In [None]:
# Write your code to find the answer here

In [None]:
# Store the answer as `num_docs`
num_docs = ____

#### <span style="color:blue">Hints</span>
- First assign the `sample_mflix` database and `movies` collection to variables, then use a PyMongo collection-level operation to count the documents in the collection. 
- This takes three lines of code total.
- Relevant docs: [Getting a database](https://pymongo.readthedocs.io/en/stable/tutorial.html#getting-a-database), [Getting a collection](https://pymongo.readthedocs.io/en/stable/tutorial.html#getting-a-collection), and [Counting](https://pymongo.readthedocs.io/en/stable/tutorial.html#counting)


#### <span style="color:green">Answer</span>
```python
db = client.sample_mflix
collection = db.movies
collection.count_documents({})

num_docs = 13101
```
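
Rather than copying the number by hand, the count can also be kept in a variable directly. The following is a minimal sketch: `count_movies` is a hypothetical helper, and the `year` field in the usage comment is an assumption about the shape of the movie documents.

```python
def count_movies(collection, query=None):
    """Count documents in `collection` that match `query`.

    An empty filter ({}) matches every document, so calling this with no
    query returns the total number of documents in the collection.
    """
    return collection.count_documents(query or {})


# In the lab, with `collection` assigned as in the answer above:
#     num_docs = count_movies(collection)
#     movies_1999 = count_movies(collection, {"year": 1999})  # assumed field
```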

### 4. Print out all the items in the `products` collection 
What is the value of the `name` field for the item with id number 6? 

In [None]:
# Write your code to find the answer here 

In [None]:
# Store the answer as `name`
name = ____

#### <span style="color:blue">Hints</span>
- Use a PyMongo collection-level operation to query for all documents in the collection. 
- Use a for loop to iterate through the results of the query.
- Use `pprint` to print the output in a more readable format.
- See docs: [Querying for more than one document](https://pymongo.readthedocs.io/en/stable/tutorial.html#querying-for-more-than-one-document)

#### <span style="color:green">Answer</span>
```python
items = collection.find()
for item in items:
    pprint(item)

name = "JavaScript Hoodie"
```
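
Printing every document can produce a lot of output for a large collection. As a hedged sketch, a cursor can be capped with `limit()` so that only the first few documents are printed; `preview` is a hypothetical helper name and the default of three documents is arbitrary.

```python
from pprint import pprint


def preview(collection, n=3):
    """Print and return the first `n` documents from `collection`."""
    docs = list(collection.find().limit(n))
    for doc in docs:
        pprint(doc)
    return docs


# In the lab, with `collection` pointing at the collection of interest:
#     preview(collection)        # first three documents
#     pprint(collection.find_one())  # or just a single document
```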
