# Mongo tutorial

## Prerequisites

### Documentation

Reference documentation:
* [MongoDB commands](https://docs.mongodb.com/manual/reference/)
* [PyMongo `MongoClient`](http://api.mongodb.com/python/current/api/pymongo/mongo_client.html#pymongo.mongo_client.MongoClient)

### Import libraries

In [None]:
!pip install pymongo

In [2]:
import datetime
from pprint import pprint

import pymongo
from pymongo import MongoClient

In [3]:
client = MongoClient('mongo', 27017)
# the host name 'mongo' corresponds to the 'image: mongo' entry in docker-compose.yml

In [4]:
# let's work in a test_database
db = client.test_database
posts = db.posts

In [5]:
post = {
    "author": "Mike",
    "text": "My first blog post!",
    "tags": ["mongodb", "python", "pymongo"],
    "date": datetime.datetime.utcnow()
}
post_id = posts.insert_one(post).inserted_id
post_id

ObjectId('65d61b5c3c32ca4cfe4e9018')

In [6]:
db.list_collection_names()

['posts']

In [8]:
pprint(posts.find_one())

{'_id': ObjectId('65ae65cadaf380c63f33ad66'),
 'author': 'Mike',
 'date': datetime.datetime(2024, 1, 22, 12, 55, 38, 829000),
 'tags': ['mongodb', 'python', 'pymongo'],
 'text': 'My first blog post!'}


You can open a terminal alongside the notebook, connect to your server with a mongo client, and check that the document is there:

```bash
vagrant@nosql:~$ mongo
> show databases;
admin          0.000GB
config         0.000GB
local          0.000GB
test_database  0.000GB
> use test_database;
switched to db test_database
> db.posts.find()
{ 
    "_id" : ObjectId("..."), 
    "author" : "Mike", 
    "text" : "My first blog post!", 
    "tags" : [ "mongodb", "python", "pymongo"], 
    "date" : ISODate("2019-02-10T11:33:47.883Z") 
}
```

In [None]:
# from Git Bash, to be redone later

## I. Quick start

### First steps

**Q**: Create a document `{msg: 'hello'}` in the `test` collection with `insert_one()`. Fetch it back to display it. What is the `_id` field for?

NB : if the collection doesn't exist yet, MongoDB automatically creates it.

In [9]:
# let's work in "test" database
db = client.test
posts = db.posts

In [10]:
doc = {
    "msg": 'hello'}

In [11]:
msg = posts.insert_one(doc)
pprint(posts.find_one())

{'_id': ObjectId('65ae6660daf380c63f33ad68'), 'msg': 'hello'}
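
The `_id` field is the document's primary key: it must be unique within a collection, and when you do not supply one the driver generates an `ObjectId` for you. As a side note, the first 4 bytes of an `ObjectId` encode its creation time in seconds since the Unix epoch. The sketch below decodes that timestamp from the hexadecimal string using only the standard library. `objectid_creation_time` is a hypothetical helper name, not part of PyMongo (PyMongo exposes the same information as `ObjectId.generation_time`).

```python
from datetime import datetime, timezone


def objectid_creation_time(oid_hex: str) -> datetime:
    """Return the creation time embedded in a 24-character ObjectId hex string."""
    if len(oid_hex) != 24:
        raise ValueError("an ObjectId is 12 bytes, i.e. 24 hexadecimal characters")
    # The first 4 bytes (8 hex characters) are a big-endian Unix timestamp in seconds.
    seconds = int(oid_hex[:8], 16)
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


# The _id returned by the cell above
print(objectid_creation_time("65ae6660daf380c63f33ad68"))
```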


**Q**: Display the number of documents inside the `test` collection

In [12]:
pprint(posts.count_documents({}))

2


### Interacting with a database

We have 2 `.json` files we want to interact with inside the `data` folder. Let's first dump them into a `MovieLens` database, inside `users` and `movies` collections.

For this section, you will need to read a bit about [query operators](https://docs.mongodb.com/manual/reference/operator/query/#query-selectors). Most collection methods you will use take `filter` as their first parameter, to which you pass a dictionary of query conditions.
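
To make the shape of a `filter` concrete, here is a minimal pure-Python sketch of how a few query selectors can be read. It is a simplified mental model, not MongoDB's implementation: `matches` and `OPERATORS` are hypothetical names, only `$lt`, `$lte`, `$gt`, `$gte` and `$in` are handled, and the sample documents are made up.

```python
import operator

# Simplified model of a few comparison selectors (illustration only).
OPERATORS = {
    "$lt": operator.lt,
    "$lte": operator.le,
    "$gt": operator.gt,
    "$gte": operator.ge,
    "$in": lambda value, choices: value in choices,
}


def matches(document: dict, query: dict) -> bool:
    """Return True when every condition in `query` holds for `document`."""
    for field, condition in query.items():
        if field not in document:
            # Simplification: MongoDB has richer rules for missing fields.
            return False
        value = document[field]
        if isinstance(condition, dict):
            # {"age": {"$gte": 18, "$lt": 30}}: all operators must hold
            if not all(OPERATORS[op](value, arg) for op, arg in condition.items()):
                return False
        elif value != condition:
            # {"gender": "F"}: plain equality
            return False
    return True


# Made-up sample documents
users = [
    {"name": "Ada", "age": 17, "gender": "F"},
    {"name": "Bob", "age": 25, "gender": "M"},
    {"name": "Cleo", "age": 42, "gender": "F"},
]

adults = [u["name"] for u in users if matches(u, {"age": {"$gte": 18}})]
print(adults)  # ['Bob', 'Cleo']
```

An empty filter `{}` has no conditions, so it matches every document, which is why `count_documents({})` counts the whole collection.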

**Q** : In the `MovieLens` database, load `data/movielens_movies.json` into `movies` and `data/movielens_users.json` into `users`. 

Use the dedicated shell command for this : `mongoimport --db <some_db> --collection <some_collection> --file <some_file>` 

In [None]:
# Do this in the Anaconda shell

In [13]:
!conda install -c anaconda mongo-tools -y

In [None]:
!mongoimport --db MovieLens --collection users --file data/movielens_users.json --host mongo --port 27017

In [None]:
!mongoimport --db MovieLens --collection movies --file data/movielens_movies.json --host mongo --port 27017
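
If the `mongoimport` tool is not available, the import can be approximated from Python. The sketch below assumes the file holds one JSON document per line, a layout `mongoimport` accepts by default, and that the documents use plain JSON values only, since `json.loads` does not interpret MongoDB Extended JSON such as `{"$oid": ...}`. `read_json_lines` is a hypothetical helper.

```python
import json
from pathlib import Path


def read_json_lines(path):
    """Parse a file containing one JSON document per line into a list of dicts."""
    documents = []
    with Path(path).open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # skip blank lines
                documents.append(json.loads(line))
    return documents


# Usage against the collections of this tutorial (requires a running server):
# users = read_json_lines("data/movielens_users.json")
# client.MovieLens.users.insert_many(users)
```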

**Q** : how many users are in the `MovieLens` database ?

In [17]:
db=client.MovieLens
user_collection=db.users
pprint(user_collection.count_documents({}))

6040


**Q** : Display all comedies (the `genres` property equals `Comedy`). 

NB: `find()` returns a `Cursor`; you will need to find out how to iterate over it, then use the `pprint` function for a better display of those documents.

In [None]:
db=client.MovieLens
movies_collection=db.movies

comedy_movies_cursor = movies_collection.find({"genres": "Comedy"})

# Display the comedies using pprint
for comedy_movie in comedy_movies_cursor:
    pprint(comedy_movie)


**Q**: Fetch and display the `name` and `occupation` for Clifford Johnathan. The second parameter of `find()` ([doc here](https://api.mongodb.com/python/current/api/pymongo/collection.html#pymongo.collection.Collection.find)) is called the `projection` and limits which fields the query returns.

In [19]:
name_occupation_cursor = user_collection.find({"name": "Clifford Johnathan"}, {"occupation":1, "name":1, "_id":0})
for name_occupation in name_occupation_cursor:
    pprint(name_occupation)

{'name': 'Clifford Johnathan', 'occupation': 'technician/engineer'}
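
The effect of an inclusion projection can be pictured in plain Python. This is a simplified sketch, not PyMongo code: `project` is a hypothetical helper that handles top-level fields in inclusion mode only, and the `_id` and `age` values of the sample document are made up. It mirrors one real rule: `_id` is returned unless it is explicitly set to `0`.

```python
def project(document: dict, projection: dict) -> dict:
    """Keep only the fields flagged with 1; `_id` is kept unless set to 0."""
    wanted = {field for field, flag in projection.items() if flag}
    if projection.get("_id", 1):
        wanted.add("_id")
    return {field: value for field, value in document.items() if field in wanted}


# Made-up sample document
user = {"_id": 42, "name": "Clifford Johnathan", "age": 30,
        "occupation": "technician/engineer"}

print(project(user, {"occupation": 1, "name": 1, "_id": 0}))
print(project(user, {"name": 1}))
```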


**Q**: How many minors (by `age`) have rated movies ?

In [28]:
# "$lte" also counts users aged exactly 18; use "$lt" to count only those under 18
number_minors= user_collection.count_documents({"age": {"$lte":18}})
pprint({number_minors})

{373}


In [30]:
import re

**Q**: Display science fiction movies ('Sci-Fi') and suspense movies ('Thriller'). This time you need to use a regex to parse genres and look for those values.

In [None]:
genres=re.compile(r'Sci-Fi|Thriller', re.IGNORECASE)
resultat= movies_collection.find({"genres": {"$regex": genres}})

for movie in resultat:
    print(movie)
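
To check the pattern before sending it to the server, you can try it on a few strings locally. The sample values below are assumptions about how `genres` is stored (a single pipe-separated string); adjust them to what your documents actually contain.

```python
import re

genre_pattern = re.compile(r"Sci-Fi|Thriller", re.IGNORECASE)

# Hypothetical genre strings, for illustration only
samples = [
    "Action|Sci-Fi",
    "Comedy|Romance",
    "Crime|thriller",
    "Drama",
]

# re.search looks for the pattern anywhere in the string
matching = [s for s in samples if genre_pattern.search(s)]
print(matching)  # ['Action|Sci-Fi', 'Crime|thriller']
```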

**Q**: If we want more advanced textual search, we need a dedicated index. Use the `create_index()` method to index as [TEXT](https://docs.mongodb.com/manual/core/index-text/) the `genres` field of the `movies` collection.

In [None]:
movies_collection.create_index([("genres", "text")])

**Q**: Repeat the search for science fiction and thriller movies with the `$text` operator

In [None]:
resultat=movies_collection.find({"$text": {"$search": "Sci-Fi Thriller"}})
for movie in resultat:
    print(movie)

**Q**: Display the first 30 movies (`limit`) in alphabetical order (`sort`) by title

In [None]:
resultat=movies_collection.find().limit(30).sort("title",1)
for movie in resultat:
    print(movie)
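
Note that the order in which `sort()` and `limit()` are chained on a cursor does not matter: the server sorts first and then applies the limit. In plain Python, the query behaves like the sketch below (the titles are made-up sample data and `first_n_by_title` is a hypothetical helper).

```python
def first_n_by_title(movies, n):
    """Sort by title in ascending order, then keep the first n documents."""
    return sorted(movies, key=lambda movie: movie["title"])[:n]


# Made-up sample data
movies = [
    {"title": "Toy Story (1995)"},
    {"title": "Alien (1979)"},
    {"title": "Heat (1995)"},
]

for movie in first_n_by_title(movies, 2):
    print(movie["title"])
```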

**Q**: How many users have seen the movie "Star Wars: Episode V - The Empire Strikes Back (1980)" (`_id` 1196)? The `movies` field is an array, so we should try the [elemMatch](https://docs.mongodb.com/manual/reference/operator/projection/elemMatch/) operator here.

In [43]:
resultat=user_collection.count_documents({"movies": {'$elemMatch': {"movieid": 1196}}})
pprint({resultat})

{2990}


**Q**: And how many gave it a rating of 1 or 2 ?

In [47]:
resultat=user_collection.count_documents({"movies": {'$elemMatch': {"movieid": 1196, "rating": {"$gte":1, "$lte":2}}}})
pprint({resultat})

{105}
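
`$elemMatch` matters as soon as several conditions apply to the same array: it requires a single array element to satisfy all of them. With dotted field names instead (`"movies.movieid"` and `"movies.rating"`), each condition may be satisfied by a different element. The pure-Python sketch below illustrates the difference on a made-up user; both function names are hypothetical.

```python
def elem_match(user, movieid, max_rating):
    """Like $elemMatch: one element must satisfy both conditions."""
    return any(
        m["movieid"] == movieid and m["rating"] <= max_rating
        for m in user["movies"]
    )


def dotted_match(user, movieid, max_rating):
    """Like two dotted-field conditions: different elements may satisfy each."""
    return (
        any(m["movieid"] == movieid for m in user["movies"])
        and any(m["rating"] <= max_rating for m in user["movies"])
    )


# This made-up user loved movie 1196 and disliked another movie
user = {"movies": [{"movieid": 1196, "rating": 5}, {"movieid": 260, "rating": 1}]}

print(elem_match(user, 1196, 2))    # False
print(dotted_match(user, 1196, 2))  # True
```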


### Updating data

**Q**: Insert a new user with the properties `name`, `gender` ('M' or 'F'), `occupation` and `age`, using the `insert_one()` command. Display it with `find_one()`.

In [55]:
doc = {
    "name": "Hugo",
    "gender": "M",
    "occupation": "Eating",
    "age": 22,  # stored as a number so that range queries on `age` work
}
hugo_id = user_collection.insert_one(doc).inserted_id
hugo_document = user_collection.find_one({"_id": hugo_id})
pprint(hugo_document)

{'_id': ObjectId('65ae77aedaf380c63f33ad6c'),
 'age': 22,
 'gender': 'M',
 'name': 'Hugo',
 'occupation': 'Eating'}


**Q**: Record a rating for a movie the user has seen: with `update_one()`, add a `movies` property containing an array with one document (`movieid`, `rating`, and `timestamp` set to `datetime.datetime.utcnow()`).

You will need to read the documentation on [update operators](https://docs.mongodb.org/manual/reference/operator/update/).

In [59]:
from datetime import datetime
from bson import ObjectId

In [62]:
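user_name = "Hugo"  # the user inserted in the previous question
# The movie information to be added
movie_info = {"movieid": 1196, "rating": 4, "timestamp": datetime.utcnow()}

# $push appends to the "movies" array, creating it if it does not exist yet
result = user_collection.update_one(
    {"name": user_name},
    {"$push": {"movies": movie_info}}
)

# update_one() does not fail when no document matches, so check the result
if result.matched_count:
    print("Appreciation added successfully!")
else:
    print(f"No user named {user_name!r} was found.")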

Appreciation added successfully!


**Q**: Find the number of users who have declared a `programmer` occupation. Modify them so that they are `developer`. Verify your update.

In [63]:
programmer_count = user_collection.count_documents({"occupation": "programmer"})

print(f"Number of users with occupation 'programmer': {programmer_count}")

user_collection.update_many(
    {"occupation": "programmer"},
    {"$set": {"occupation": "developer"}}
)

Number of users with occupation 'programmer': 388


UpdateResult({'n': 388, 'nModified': 388, 'ok': 1.0, 'updatedExisting': True}, acknowledged=True)

In [67]:
# Verify
updated_users = user_collection.count_documents({"occupation": "developer"})
print(updated_users)

388


## II. Modelling a blog

We will now model a blog using Mongo. 

First, switch to a new `Blog` database. Each blog post will have the following fields:

* The author (`author` field, string type)
* The date (`date` field, string type in YYYY-MM-DD format)
* The content (`content` field)
* Tags (`tags` field, a string array)
* A list of comments (`comments` field), each containing:
  * The author (`author` field, string type)
  * The date (`date` field, string type in YYYY-MM-DD format)
  * The content (`content` field)
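
As a sketch, the structure above maps directly onto nested Python dictionaries. The helper names `make_comment` and `make_post` and the sample values are hypothetical; dates are kept as `YYYY-MM-DD` strings, as the list specifies.

```python
from datetime import date


def make_comment(author: str, day: date, content: str) -> dict:
    """Build one embedded comment document."""
    return {"author": author, "date": day.isoformat(), "content": content}


def make_post(author, day, content, tags, comments=()):
    """Build a blog post document with its embedded comments."""
    return {
        "author": author,
        "date": day.isoformat(),  # YYYY-MM-DD
        "content": content,
        "tags": list(tags),
        "comments": list(comments),
    }


# Made-up sample values
example = make_post(
    "alice",
    date(2024, 1, 10),
    "Hello",
    ["intro"],
    [make_comment("bob", date(2024, 1, 11), "Welcome!")],
)
print(example)
```

Embedding the comments inside the post keeps everything needed to display a post in a single document, at the cost of having to update the post whenever a comment is added.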


**Q**: Create a first post by `rick`, on January 15th, with the tags `mongodb` and `nosql`.

In [15]:
db = client.Blog  # the database name used in the instructions and in the cleanup cell
posts = db.posts

In [16]:
from datetime import datetime

In [17]:
post = {
    "author": "Rick",
    "date": datetime(2024,1,15),
    "content": "Enjoy my first post",
    "tags": ["mongodb","nosql"],
    "comments":[
        {"author": "Youness",
        "date": datetime(2024,1,16),
        "content": "great Work, thanks!",
        }
    ]
}
post_id = posts.insert_one(post).inserted_id
post_id

ObjectId('65faf8e7fdd02ac710c2b7d1')

**Q**: Create a second post by `kate`, on January 21, with the tag `nosql` and a comment from `rick` on the same day.

In [18]:
post2 = {
    "author": "Kate",
    "date": datetime(2024,1,21),
    "content": "This is a post",
    "tags": ["nosql"],
    "comments":[
        {"author": "Rick",
        "date": datetime(2024,1,21),
        "content": "Great work Kate, let's have a look on my previous post!",
        }
    ]
}
post2_id = posts.insert_one(post2).inserted_id
post2_id

ObjectId('65faf90cfdd02ac710c2b7d2')

**Q**: Display the author of the last post with the tag `nosql`

In [19]:
last_nosql_post = posts.find_one({"tags": "nosql"}, sort=[("date", -1)])

if last_nosql_post:
    author = last_nosql_post["author"]
    print(f"The author of the last post with the tag 'nosql' is: {author}")
else:
    print("No post with the tag 'nosql' was found.")

The author of the last post with the tag 'nosql' is: Kate


**Q**: Add a comment by `jack` on January 25, to `kate`'s post

In [20]:
new_comment = {
    "author": "Jack",
    "date": datetime(2024, 1, 25),
    "content": "This is a new comment by Jack in response to kate's post."
}

# $push appends the comment on the server in a single atomic update,
# so there is no need to read the post and write the whole array back
posts.update_one({"author": "Kate"}, {"$push": {"comments": new_comment}})

UpdateResult({'n': 1, 'nModified': 1, 'ok': 1.0, 'updatedExisting': True}, acknowledged=True)

**Q**: Display all comments by `kate`

In [14]:
kate_comments = posts.find({"comments.author": "Kate"})

# Display Kate's comments
print("Comments by Kate:")
for post in kate_comments:
    for comment in post["comments"]:
        if comment["author"] == "Kate":
            print("Author:", comment["author"])
            print("Content:", comment["content"])
            print("Date:", comment["date"])
            print("----------------------")

Comments by Kate:


In [None]:
# It is expected that there are none: Kate wrote a post but did not write any comment
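
The nested loop above filters the comments on the client. As an alternative sketch, the aggregation framework can do the same work on the server: `$unwind` produces one document per comment, `$match` keeps those by the given author, and `$replaceRoot` returns the comment itself. `comments_by_author_pipeline` is a hypothetical helper; the stages are standard aggregation stages.

```python
def comments_by_author_pipeline(author: str) -> list:
    """Build an aggregation pipeline returning every comment written by `author`."""
    return [
        {"$unwind": "$comments"},                    # one document per comment
        {"$match": {"comments.author": author}},     # keep this author's comments
        {"$replaceRoot": {"newRoot": "$comments"}},  # return only the comment
    ]


pipeline = comments_by_author_pipeline("Rick")
print(pipeline)

# Usage (requires a running server):
# for comment in posts.aggregate(pipeline):
#     print(comment["date"], comment["content"])
```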

## Cleanup

In [None]:
!mongo --host mongo test_database --eval 'db.dropDatabase()'

In [None]:
!mongo --host mongo MovieLens --eval 'db.dropDatabase()'

In [None]:
!mongo --host mongo Blog --eval 'db.dropDatabase()'
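
The same cleanup can be done from Python with `MongoClient.drop_database()`, which avoids depending on the `mongo` shell being installed in the notebook environment. A minimal sketch: `drop_databases` is a hypothetical helper, and the database names must match the ones you actually created.

```python
def drop_databases(client, names):
    """Drop each named database through the given MongoClient."""
    for name in names:
        client.drop_database(name)


# Usage (requires a running server):
# drop_databases(client, ["test_database", "test", "MovieLens", "Blog"])
```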