### Step 3: Loading the data into the MongoDB collection
Insert the documents into the `products` collection.

#### Load the JSON file into a Python data structure

In [2]:
import json

# products.json holds one JSON document per line
with open('products.json', 'rb') as fin:
    products = [json.loads(line) for line in fin]
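
The cell above assumes every line of the file is a valid JSON document. A minimal sketch of a more forgiving loader is shown below; the helper name `parse_json_lines` and the sample lines are hypothetical, not part of the original notebook. It skips blank lines and reports the line number of any line that fails to parse.

```python
import json

def parse_json_lines(lines):
    """Parse an iterable of JSON Lines records, skipping blank lines."""
    documents = []
    for line_number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            # A blank line (for example a trailing newline) is not a record
            continue
        try:
            documents.append(json.loads(line))
        except json.JSONDecodeError as error:
            raise ValueError(f"Invalid JSON on line {line_number}: {error}") from error
    return documents

# Hypothetical sample input standing in for the lines of products.json
sample_lines = ['{"name": "A"}\n', '\n', '{"name": "B"}\n']
parsed = parse_json_lines(sample_lines)
```

The same helper could be applied to the real file with `parse_json_lines(fin)` inside the `with` block.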

In [3]:
import pymongo
import credentials

connection_string = f"mongodb+srv://{credentials.username}:{credentials.password}@cluster0.svxejws.mongodb.net/?retryWrites=true&w=majority"
client = pymongo.MongoClient(connection_string)

db = client['retail_company']  # the database
collection = db['products']    # the collection inside that database
# Uncomment the next line to drop the collection before rerunning the notebook,
# so the same documents are not inserted a second time.
#collection.drop()
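
MongoDB connection strings require reserved characters such as `@`, `:` and `/` in the username or password to be percent-encoded. The sketch below shows one way to do that with the standard library; the helper name `build_connection_string` and the example credentials are hypothetical, and the host is the one used in the cell above.

```python
from urllib.parse import quote_plus

def build_connection_string(username, password, host):
    """Build a mongodb+srv URI with percent-encoded credentials."""
    return (
        f"mongodb+srv://{quote_plus(username)}:{quote_plus(password)}"
        f"@{host}/?retryWrites=true&w=majority"
    )

# Hypothetical credentials, chosen to contain characters that need escaping
example = build_connection_string(
    "app_user", "p@ss:w/rd", "cluster0.svxejws.mongodb.net"
)
```

In the notebook, the resulting string would be passed to `pymongo.MongoClient` in place of the hand-built f-string.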


In [4]:
# Load the documents into the MongoDB collection
result = collection.insert_many(products)

### Step 4: Demonstrate an aggregation query on the data

We want the average size of laptops for each color, sorted from the largest average to the smallest.

In [5]:
average_size_by_color = collection.aggregate([
    # Keep only documents whose product name is "Laptop"
    {"$match": {"product_info.product_names": "Laptop"}},
    # Group by color and average the size within each group
    {"$group": {"_id": "$product_info.color", "average_size": {"$avg": "$size"}}},
    # Sort by average size, largest first
    {"$sort": {"average_size": -1}},
])

The `$match` stage keeps only documents whose product name is "Laptop". The `$group` stage then groups those documents by color and computes the average size within each group, and the `$sort` stage orders the groups from highest to lowest average size. The result is one document per color, with the color in `_id` and the mean in `average_size`.
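
To make the three stages concrete, here is a rough pure-Python equivalent of the pipeline, run against a handful of in-memory documents. The sample documents and the function name `average_size_by_color_local` are hypothetical. The field layout is inferred from the pipeline above, and the sketch assumes the product name is a plain string and every matching document has a numeric `size`.

```python
from collections import defaultdict

# Hypothetical sample documents; the real collection may look different
sample_products = [
    {"product_info": {"product_names": "Laptop", "color": "silver"}, "size": 15.0},
    {"product_info": {"product_names": "Laptop", "color": "silver"}, "size": 13.0},
    {"product_info": {"product_names": "Laptop", "color": "black"}, "size": 17.0},
    {"product_info": {"product_names": "Phone", "color": "black"}, "size": 6.0},
]

def average_size_by_color_local(documents):
    """Mimic the $match, $group and $sort stages on a list of dicts."""
    sizes_by_color = defaultdict(list)
    for doc in documents:
        info = doc.get("product_info", {})
        # $match: keep only laptops
        if info.get("product_names") != "Laptop":
            continue
        # $group: collect sizes under each color
        sizes_by_color[info.get("color")].append(doc["size"])
    results = [
        {"_id": color, "average_size": sum(sizes) / len(sizes)}
        for color, sizes in sizes_by_color.items()
    ]
    # $sort: highest average first
    results.sort(key=lambda row: row["average_size"], reverse=True)
    return results

local_results = average_size_by_color_local(sample_products)
```

With this sample data, black laptops come first with an average of 17.0, followed by silver with 14.0. The phone is excluded by the match step.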

### Step 5: Save the query results to a JSON or BSON file

In [6]:
import bson.json_util as bju

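# The aggregation cursor can only be iterated once, so write each record as it is read.
# The with block closes the file automatically, so no explicit close() is needed.
with open("queried_products.json", "w") as fout:
    for record in average_size_by_color:
        fout.write(bju.dumps(record, indent=2))
        fout.write('\n')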
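
Because each record is written with `indent=2` and followed by a newline, the output file is a sequence of multi-line JSON objects rather than a single JSON document, so `json.load` on the whole file would fail. The sketch below shows one way to read such a file back using only the standard library. The helper name `read_concatenated_json` and the sample text are hypothetical, and plain `json` will not convert MongoDB extended-JSON types back to BSON types the way `bson.json_util.loads` would.

```python
import json

def read_concatenated_json(text):
    """Parse a string containing several JSON values separated by whitespace."""
    decoder = json.JSONDecoder()
    records = []
    position = 0
    while position < len(text):
        # raw_decode does not skip leading whitespace, so do it here
        while position < len(text) and text[position].isspace():
            position += 1
        if position == len(text):
            break
        record, position = decoder.raw_decode(text, position)
        records.append(record)
    return records

# Hypothetical sample mirroring the layout written to queried_products.json
sample_text = (
    '{\n  "_id": "black",\n  "average_size": 17.0\n}\n'
    '{\n  "_id": "silver",\n  "average_size": 14.0\n}\n'
)
records = read_concatenated_json(sample_text)
```

For the real file, the text would come from `open("queried_products.json").read()`.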
