# Movies Database Project

In what follows, we use the Mflix Movies database, which is already installed on the MongoDB server of your Atlas cluster.

This database includes a collection holding information on more than 21,000 films.

For each film, we have fields such as title, year, genres, countries, and so on.

=> The goal is to explore and query this data.

# 1- Connect to the Mflix database

- Connect to the MongoDB server of your Atlas cluster

- Check that the sample databases are loaded on your Atlas cluster:

https://www.mongodb.com/docs/atlas/sample-data/

- List the collections of the sample_mflix database

- Display the number of documents per collection

In [1]:
# Import dependencies
import os

import pymongo
from dotenv import load_dotenv
from pymongo import MongoClient, ASCENDING


In [2]:
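# Load credentials from a local .env file
load_dotenv()
usr = os.getenv("mongo_atlas_usr")
pwd = os.getenv("mongo_atlas_pw")
api_key = os.getenv("mongo_altas_api")

# os.getenv() returns None instead of raising, so check explicitly
if not usr or not pwd:
    raise RuntimeError("Missing mongo_atlas_usr or mongo_atlas_pw in the environment")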

In [3]:
#replace "uri" with your Atlas URI string - should look like mongodb+srv://...
db_name = "sample_mflix"
uri = f"mongodb+srv://{usr}:{pwd}@cluster0.xxqem.mongodb.net/?retryWrites=true&w=majority&appName=Cluster0"
client = MongoClient(uri)
client

MongoClient(host=['cluster0-shard-00-00.xxqem.mongodb.net:27017', 'cluster0-shard-00-02.xxqem.mongodb.net:27017', 'cluster0-shard-00-01.xxqem.mongodb.net:27017'], document_class=dict, tz_aware=False, connect=True, retrywrites=True, w='majority', appname='Cluster0', authsource='admin', replicaset='atlas-lbggfj-shard-0', tls=True)

In addition to the Atlas cluster address, the MongoClient constructor accepts many optional parameters.
You can set the maximum connection pool size, the default read and write concerns, whether to retry writes, the TLS/SSL configuration, authentication, and more.
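
As an illustration of these optional parameters, here is a minimal sketch. The helper `build_atlas_uri`, the host placeholder, and the option values are assumptions made for the example. `maxPoolSize`, `serverSelectionTimeoutMS` and `retryWrites` are keyword arguments that MongoClient accepts. Percent-escaping the credentials avoids a malformed URI when the password contains characters such as `@` or `/`.

```python
from urllib.parse import quote_plus


def build_atlas_uri(user: str, password: str, host: str, app_name: str = "Cluster0") -> str:
    """Build an SRV connection string with percent-escaped credentials."""
    return (
        f"mongodb+srv://{quote_plus(user)}:{quote_plus(password)}@{host}/"
        f"?retryWrites=true&w=majority&appName={app_name}"
    )


# Optional keyword arguments for MongoClient (illustrative values, not recommendations).
client_options = {
    "maxPoolSize": 50,                 # upper bound on pooled connections
    "serverSelectionTimeoutMS": 5000,  # fail fast if the cluster is unreachable
    "retryWrites": True,               # retry supported write operations once
}

# Usage (requires pymongo and a reachable cluster):
# client = MongoClient(build_atlas_uri(usr, pwd, "<your-cluster-host>"), **client_options)
```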


* List the collections of the sample_mflix database
* Display the number of documents per collection

In [4]:
#Display available databases.
client.list_database_names()

['sample_airbnb',
 'sample_analytics',
 'sample_geospatial',
 'sample_guides',
 'sample_mflix',
 'sample_restaurants',
 'sample_supplies',
 'sample_training',
 'sample_weatherdata',
 'admin',
 'local']

In [5]:
#Let's use the sample_mflix database, display available collections

mflix = client.sample_mflix
#mflix = client['sample_mflix']
mflix.list_collection_names()


['sessions', 'comments', 'users', 'embedded_movies', 'theaters', 'movies']

In [6]:
# Count the documents in the movies collection
mflix.movies.count_documents({})


21349

In [7]:
movies = mflix.movies
movies.count_documents({})

21349
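
The task asks for a count per collection, whereas the cells above count only `movies`. A minimal sketch, assuming `db` is a pymongo Database such as the `mflix` object above; the helper name `count_per_collection` is hypothetical:

```python
def count_per_collection(db):
    """Return {collection name: document count} for every collection in db."""
    return {
        name: db[name].count_documents({})  # empty filter counts every document
        for name in db.list_collection_names()
    }

# Usage with the notebook's database handle:
# count_per_collection(mflix)
```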

# 2- Explore the Mflix data

In [8]:
movies.find_one()

{'_id': ObjectId('573a1390f29313caabcd42e8'),
 'plot': 'A group of bandits stage a brazen train hold-up, only to find a determined posse hot on their heels.',
 'genres': ['Short', 'Western'],
 'runtime': 11,
 'cast': ['A.C. Abadie',
  "Gilbert M. 'Broncho Billy' Anderson",
  'George Barnes',
  'Justus D. Barnes'],
 'poster': 'https://m.media-amazon.com/images/M/MV5BMTU3NjE5NzYtYTYyNS00MDVmLWIwYjgtMmYwYWIxZDYyNzU2XkEyXkFqcGdeQXVyNzQzNzQxNzI@._V1_SY1000_SX677_AL_.jpg',
 'title': 'The Great Train Robbery',
 'fullplot': "Among the earliest existing films in American cinema - notable as the first film that presented a narrative story to tell - it depicts a group of cowboy outlaws who hold up a train and rob the passengers. They are then pursued by a Sheriff's posse. Several scenes have color included - all hand tinted.",
 'languages': ['English'],
 'released': datetime.datetime(1903, 12, 1, 0, 0),
 'directors': ['Edwin S. Porter'],
 'rated': 'TV-G',
 'awards': {'wins': 1, 'nominations': 0, 

* How many documents store the year attribute as a string, and how many as an int?

In [9]:
# Aggregation that counts 'year' values by BSON type (string vs. int)
pipeline = [
    {
        "$facet": {
            "year_string_format": [
                {"$match": {"year": {"$type": "string"}}},
                {"$count": "count"}
            ],
            "year_int_format": [
                {"$match": {"year": {"$type": "int"}}},
                {"$count": "count"}
            ],
            "total": [
                {"$count": "count"}
            ]
        }
    }
]

# Run the aggregation
result = list(movies.aggregate(pipeline))
result

[{'year_string_format': [{'count': 35}],
  'year_int_format': [{'count': 21314}],
  'total': [{'count': 21349}]}]
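
The `$facet` output nests each count inside a one-element list, and a facet whose `$match` finds nothing yields an empty list rather than a zero. The sketch below flattens that shape into a plain dict; the helper name `flatten_facet_counts` is hypothetical.

```python
def flatten_facet_counts(facet_result):
    """Turn [{'facet': [{'count': n}], ...}] into {'facet': n, ...}."""
    if not facet_result:
        return {}
    return {
        name: (docs[0]["count"] if docs else 0)  # empty facet means zero matches
        for name, docs in facet_result[0].items()
    }

# Usage with the result of the cell above:
# flatten_facet_counts(result)
```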

 
- How many films feature "Salma Hayek"? Display these documents, limited to the title and the list of actors.


In [10]:
movies.count_documents({"cast": {"$in": ["Salma Hayek"]}})

20
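
The cell above answers only the counting part of the question. To display the matching documents limited to the title and cast, a filter and a projection can be passed to `find()`. A minimal sketch; the helper name `actor_query` is hypothetical:

```python
def actor_query(actor):
    """Return (filter, projection) for films whose cast array contains actor."""
    query_filter = {"cast": actor}  # equality on an array field matches any element
    projection = {"_id": 0, "title": 1, "cast": 1}
    return query_filter, projection

# Usage with the notebook's collection handle:
# list(movies.find(*actor_query("Salma Hayek")))
```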

* How many films are titled "The Journey"? Note that find() returns an iterable cursor, not a document!
* For each document, display the title, the year, and the list of genres.

In [11]:
#using find
result = movies.find(
    {
        "title": "The Journey"
    },
    {
        'genres': 1,
        'title': 1,
        'year': 1,
        '_id': 0
    }
)

# Converting the result cursor to a list
the_journey_list = list(result)
the_journey_list

[{'genres': ['Drama'], 'title': 'The Journey', 'year': 1986},
 {'genres': ['Drama', 'History'], 'title': 'The Journey', 'year': 1992},
 {'genres': ['Comedy', 'Drama'], 'title': 'The Journey', 'year': 1997},
 {'genres': ['Documentary', 'Adventure'],
  'title': 'The Journey',
  'year': 2001},
 {'genres': ['Drama', 'Romance'], 'title': 'The Journey', 'year': 2004},
 {'genres': ['Comedy', 'Drama', 'Family'],
  'title': 'The Journey',
  'year': 2014}]

In [12]:
pipeline = [
    {
        "$match": {
            "title": 'The Journey'
        }
    },
    {
        "$project": {
            'genres': 1,
            "title": 1,
            "year": 1,
            '_id': 0
        }
    }]

the_journey_list = list(movies.aggregate(pipeline))

print(len(the_journey_list))
the_journey_list

6


[{'genres': ['Drama'], 'title': 'The Journey', 'year': 1986},
 {'genres': ['Drama', 'History'], 'title': 'The Journey', 'year': 1992},
 {'genres': ['Comedy', 'Drama'], 'title': 'The Journey', 'year': 1997},
 {'genres': ['Documentary', 'Adventure'],
  'title': 'The Journey',
  'year': 2001},
 {'genres': ['Drama', 'Romance'], 'title': 'The Journey', 'year': 2004},
 {'genres': ['Comedy', 'Drama', 'Family'],
  'title': 'The Journey',
  'year': 2014}]

## Answer as many of the following queries as you can:

- Display 12 titles from the movies collection, starting from the tenth (inclusive)

- Display the result sorted in descending alphabetical order

- List all films produced in 1979

- Display the information about the film titled "Alien" produced in 1979

- Find all films that won 100 awards

- All films whose title starts with 'Re'

- All films produced after 2010 and before 2015

- Display the films starring "Tom Cruise" released after 2000, limited to the film title and the list of actors

- Display all details of the "Tom Cruise" film released in 2014, except its fullplot

- Find the films featuring at least one of the following actors: Angelina Jolie, Brad Pitt

- Find the films featuring none of the following actors: Sandra Bullock, Tom Hanks, Julia Roberts, Kevin Spacey, George Clooney

- List the films released in 2016 or that won 100 awards

- List the films released after 2010 featuring one of these actors: Sandra Bullock, Tom Hanks, Julia Roberts, Kevin Spacey, George Clooney

- How many films were directed by Clint Eastwood?

- From the previous list, and using the comments collection of the sample_mflix database, how many films have been commented on?




In [13]:
# Display 12 titles from the movies collection, starting from the tenth (inclusive)
[movie['title'] for movie in movies.find().skip(9).limit(12)]

['Civilization',
 'Where Are My Children?',
 'The Poor Little Rich Girl',
 'Wild and Woolly',
 'The Blue Bird',
 'From Hand to Mouth',
 'High and Dizzy',
 'One Week',
 'The Saphead',
 'The Ace of Hearts',
 'The Four Horsemen of the Apocalypse',
 'Miss Lulu Bett']

In [14]:
# Display 12 titles starting from the tenth (inclusive), sorted by title in descending order
pipeline = [
    {
        "$sort": {
            "title": -1
        }
    },
    {
        "$project": {
            "title": 1,
            "_id": 0
        }

    },
    {
        "$skip": 9
    },
    {
        "$limit": 12
    }
]

result = list(movies.aggregate(pipeline))
result

[{'title': 'èVivan las Antipodas!'},
 {'title': 'èAy, Carmela!'},
 {'title': 'èA volar joven!'},
 {'title': 'è nos amours'},
 {'title': 'è moi seule'},
 {'title': 'è Paè, è'},
 {'title': 'è Nous la Libertè'},
 {'title': 'xXx: State of the Union'},
 {'title': 'xXx'},
 {'title': 'tom thumb'},
 {'title': 's/y Glèdjen'},
 {'title': 'iMurders'}]

In [15]:
# List all films produced in 1979
pipeline = [
    {
        "$match": {
            "year": 1979
        }
    },
    {
        "$project": {
            "result": ["$title", "$year"],
            "_id": 0
        }
    }
]
[doc["result"] for doc in movies.aggregate(pipeline)]

[['Julio Begins in July', 1979],
 ['Life Sentence', 1979],
 ['Buck Rogers in the 25th Century', 1979],
 ['Family Nest', 1979],
 ['Hungarian Rhapsody', 1979],
 ['Cop or Hood', 1979],
 ['Traffic Jam', 1979],
 ['Iskanderija... lih?', 1979],
 ['Joi Baba Felunath: The Elephant God', 1979],
 ['The Clonus Horror', 1979],
 ['Un uomo in ginocchio', 1979],
 ['...And Justice for All.', 1979],
 ['10', 1979],
 ['Agatha', 1979],
 ["The Concorde... Airport '79", 1979],
 ['Alien', 1979],
 ['The Amityville Horror', 1979],
 ['Love on the Run', 1979],
 ['Camera Buff', 1979],
 ['All That Jazz', 1979],
 ['Apocalypse Now', 1979],
 ['Arabian Adventure', 1979],
 ['Rapture', 1979],
 ['Being There', 1979],
 ['Beyond the Poseidon Adventure', 1979],
 ['The Black Hole', 1979],
 ['Best Boy', 1979],
 ['The Black Stallion', 1979],
 ['The Tin Drum', 1979],
 ['The Brood', 1979],
 ['The Bugs Bunny/Road-Runner Movie', 1979],
 ['Buffet Froid', 1979],
 ['Breaking Away', 1979],
 ['The Champ', 1979],
 ['Chapter Two', 1979],


In [16]:
# Display the information about the film titled "Alien" produced in 1979
pipeline = [
    {
        "$match": {
            "title": "Alien",
            "year": 1979
        }
    },
    {
        "$project": {
            "title": 1,
            "year": 1,
            "cast": 1,
            '_id': 0
        }
    }
]
list(movies.aggregate(pipeline))

[{'cast': ['Tom Skerritt',
   'Sigourney Weaver',
   'Veronica Cartwright',
   'Harry Dean Stanton'],
  'title': 'Alien',
  'year': 1979}]

In [17]:
# Find all films that won at least 100 awards.
# aggregate() returns a lazy CommandCursor; wrap it in list() to see the documents.
pipeline = [
    {
        "$match": {
            "awards.wins": {
                "$gte": 100
            }
        }
    },
    {
        "$project": {
            "title": 1,
            "_id": 0
        }
    }
]
movies.aggregate(pipeline)

<pymongo.command_cursor.CommandCursor at 0x17684019fd0>

In [18]:
# All films whose title starts with 'Re'
pipeline = [
    {
        "$match": {
            "title": {
                "$regex": "^Re"
            }
        }
    },
    {
        "$project": {
            "title": 1,
            "_id": 0
        }
    }
]
list(movies.aggregate(pipeline))

[{'title': 'Regeneration'},
 {'title': 'Red Dust'},
 {'title': 'Rembrandt'},
 {'title': 'Report from the Aleutians'},
 {'title': 'Red Meadows'},
 {'title': 'Red River'},
 {'title': 'Repast'},
 {'title': 'Rear Window'},
 {'title': 'Rebel Without a Cause'},
 {'title': 'Requiem for a Heavyweight'},
 {'title': 'Rece do gèry'},
 {'title': 'Rengè kantai shirei chèkan: Yamamoto Isoroku'},
 {'title': 'Reconstruction'},
 {'title': 'Red Psalm'},
 {'title': 'Reminiscences of a Journey to Lithuania'},
 {'title': 'Release the Prisoners to Spring'},
 {'title': 'Remember My Name'},
 {'title': 'Revenge of the Pink Panther'},
 {'title': 'Resurrection'},
 {'title': 'Return of the Secaucus Seven'},
 {'title': 'Revenge of the Stepford Wives'},
 {'title': 'Reds'},
 {'title': 'Red Bells Part I: Mexico on Fire'},
 {'title': 'Reuben, Reuben'},
 {'title': 'Reilly: Ace of Spies'},
 {'title': 'Rembetiko'},
 {'title': 'Repo Man'},
 {'title': 'Red Dawn'},
 {'title': 'Real Genius'},
 {'title': 'Red Sonja'},
 {'titl

In [19]:
# All films produced from 2010 through 2015, bounds included.
# For a strict "after 2010 and before 2015", use "$gt" and "$lt" instead.
pipeline = [
    {
        "$match": {
            "year": {
                "$gte": 2010,
                "$lte": 2015
            }
        }
    },
    {
        "$project": {
            "title": 1,
            "year": 1,
            "_id": 0
        }
    }
]
list(movies.aggregate(pipeline))

[{'title': 'Pèl Adrienn', 'year': 2010},
 {'year': 2010, 'title': 'In My Sleep'},
 {'year': 2012, 'title': 'On the Road'},
 {'year': 2013, 'title': 'The Secret Life of Walter Mitty'},
 {'title': 'Jurassic World', 'year': 2015},
 {'title': 'The Pacific', 'year': 2010},
 {'title': 'The Rum Diary', 'year': 2011},
 {'title': 'Gnomeo & Juliet', 'year': 2011},
 {'title': 'The Three Stooges', 'year': 2012},
 {'title': 'The Crimson Petal and the White', 'year': 2011},
 {'title': 'Tangled', 'year': 2010},
 {'title': 'John Carter', 'year': 2012},
 {'title': 'Utomlyonnye solntsem 2: Predstoyanie', 'year': 2010},
 {'title': 'Action Jackson', 'year': 2014},
 {'year': 2013, 'title': 'In Secret'},
 {'year': 2011, 'title': 'Cowboys & Aliens'},
 {'title': 'The Stanford Prison Experiment', 'year': 2015},
 {'title': 'Hemingway & Gellhorn', 'year': 2012},
 {'year': 2010, 'title': 'Dinner for Schmucks'},
 {'title': 'The A-Team', 'year': 2010},
 {'title': 'The Possession', 'year': 2012},
 {'year': 2011, 'ti
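
The question asks for films "after 2010 and before 2015", which reads as strict bounds, while `$gte`/`$lte` include both endpoints. A small sketch that makes the choice explicit; the helper name `year_range_filter` is hypothetical:

```python
def year_range_filter(start, end, inclusive=False):
    """Build a filter on 'year' with strict ($gt/$lt) or inclusive ($gte/$lte) bounds."""
    lower, upper = ("$gte", "$lte") if inclusive else ("$gt", "$lt")
    return {"year": {lower: start, upper: end}}

# Usage with the notebook's collection handle:
# movies.count_documents(year_range_filter(2010, 2015))        # 2011 to 2014
# movies.count_documents(year_range_filter(2010, 2015, True))  # 2010 to 2015
```

Documents whose `year` is stored as a string are not matched by either numeric range.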

In [20]:
# Display the films starring "Tom Cruise" released after 2000, limited to the film title and the list of actors
pipeline = [
    {
        "$match": {
            "cast": "Tom Cruise",
            "year": {
                "$gt": 2000
            }
        }
    },
    {
        "$project": {
            "title": 1,
            "cast": 1,
            "year": 1,
            "_id": 0
        }
    },
    {
        "$sort": {
            "year": -1
        }
    }
]
list(movies.aggregate(pipeline))

[{'cast': ['Tom Cruise', 'Emily Blunt', 'Brendan Gleeson', 'Bill Paxton'],
  'title': 'Edge of Tomorrow',
  'year': 2014},
 {'year': 2013,
  'title': 'Oblivion',
  'cast': ['Tom Cruise',
   'Morgan Freeman',
   'Olga Kurylenko',
   'Andrea Riseborough']},
 {'year': 2012,
  'title': 'Jack Reacher',
  'cast': ['Tom Cruise', 'Rosamund Pike', 'Richard Jenkins', 'David Oyelowo']},
 {'year': 2011,
  'title': 'Mission: Impossible - Ghost Protocol',
  'cast': ['Tom Cruise', 'Paula Patton', 'Simon Pegg', 'Jeremy Renner']},
 {'year': 2010,
  'title': 'Knight and Day',
  'cast': ['Tom Cruise', 'Cameron Diaz', 'Peter Sarsgaard', 'Jordi Mollè']},
 {'cast': ['Tom Cruise', 'Kenneth Branagh', 'Bill Nighy', 'Tom Wilkinson'],
  'title': 'Valkyrie',
  'year': 2008},
 {'cast': ['Robert Redford', 'Meryl Streep', 'Tom Cruise', 'Michael Peèa'],
  'title': 'Lions for Lambs',
  'year': 2007},
 {'year': 2006,
  'title': 'Mission: Impossible III',
  'cast': ['Tom Cruise',
   'Philip Seymour Hoffman',
   'Ving Rh

In [21]:
# Display all details of the "Tom Cruise" film released in 2014, except its fullplot
list(movies.find({"cast": "Tom Cruise", "year": 2014}, {'_id': 0, 'fullplot': 0}))

[{'plot': 'A military officer is brought into an alien war against an extraterrestrial enemy who can reset the day and know the future. When this officer is enabled with the same power, he teams up with a Special Forces warrior to try and end the war.',
  'genres': ['Action', 'Adventure', 'Sci-Fi'],
  'runtime': 113,
  'metacritic': 71,
  'rated': 'PG-13',
  'cast': ['Tom Cruise', 'Emily Blunt', 'Brendan Gleeson', 'Bill Paxton'],
  'poster': 'https://m.media-amazon.com/images/M/MV5BMTc5OTk4MTM3M15BMl5BanBnXkFtZTgwODcxNjg3MDE@._V1_SY1000_SX677_AL_.jpg',
  'title': 'Edge of Tomorrow',
  'languages': ['English'],
  'released': datetime.datetime(2014, 6, 6, 0, 0),
  'directors': ['Doug Liman'],
  'writers': ['Christopher McQuarrie (screenplay)',
   'Jez Butterworth (screenplay)',
   'John-Henry Butterworth (screenplay)',
   'Hiroshi Sakurazaka (novel)'],
  'awards': {'wins': 12,
   'nominations': 28,
   'text': '12 wins & 28 nominations.'},
  'lastupdated': '2015-08-22 00:03:12.767000000',

In [22]:
# Find the films featuring at least one of the following actors: Angelina Jolie, Brad Pitt
pipeline = [
    {
        "$match": {
            "$or": [
                {"cast": "Angelina Jolie"},
                {"cast": "Brad Pitt"}
            ]
        }
    },
    {
        "$project": {
            "title": 1,
            "cast": 1,
            "_id": 0
        }
    }
]
list(movies.aggregate(pipeline))

[{'title': 'Cool World',
  'cast': ['Kim Basinger', 'Gabriel Byrne', 'Brad Pitt', 'Michele Abrams']},
 {'title': 'Johnny Suede',
  'cast': ['Brad Pitt', 'Richard Boes', 'Cheryl Costa', 'Michael Luciano']},
 {'title': 'A River Runs Through It',
  'cast': ['Craig Sheffer', 'Brad Pitt', 'Tom Skerritt', 'Brenda Blethyn']},
 {'title': 'Kalifornia',
  'cast': ['Brad Pitt', 'Kathy Larson', 'David Milford', 'David Duchovny']},
 {'title': 'Interview with the Vampire: The Vampire Chronicles',
  'cast': ['Brad Pitt',
   'Christian Slater',
   'Virginia McCollam',
   'John McConnell']},
 {'title': 'Legends of the Fall',
  'cast': ['Brad Pitt', 'Anthony Hopkins', 'Aidan Quinn', 'Julia Ormond']},
 {'title': 'Se7en',
  'cast': ['Morgan Freeman',
   'Andrew Kevin Walker',
   'Daniel Zacapa',
   'Brad Pitt']},
 {'cast': ['Dana Delany', 'Annabeth Gish', 'Angelina Jolie', 'Tina Majorino'],
  'title': 'True Women'},
 {'title': 'Meet Joe Black',
  'cast': ['Brad Pitt', 'Anthony Hopkins', 'Claire Forlani', 

In [23]:
list(movies.find(
    {"$or":
         [{"cast": "Angelina Jolie"},
          {"cast": "Brad Pitt"}]},
    {"_id": 0, "cast": 1}))


[{'cast': ['Kim Basinger', 'Gabriel Byrne', 'Brad Pitt', 'Michele Abrams']},
 {'cast': ['Brad Pitt', 'Richard Boes', 'Cheryl Costa', 'Michael Luciano']},
 {'cast': ['Craig Sheffer', 'Brad Pitt', 'Tom Skerritt', 'Brenda Blethyn']},
 {'cast': ['Brad Pitt', 'Kathy Larson', 'David Milford', 'David Duchovny']},
 {'cast': ['Brad Pitt',
   'Christian Slater',
   'Virginia McCollam',
   'John McConnell']},
 {'cast': ['Brad Pitt', 'Anthony Hopkins', 'Aidan Quinn', 'Julia Ormond']},
 {'cast': ['Morgan Freeman',
   'Andrew Kevin Walker',
   'Daniel Zacapa',
   'Brad Pitt']},
 {'cast': ['Dana Delany', 'Annabeth Gish', 'Angelina Jolie', 'Tina Majorino']},
 {'cast': ['Brad Pitt', 'Anthony Hopkins', 'Claire Forlani', 'Jake Weber']},
 {'cast': ['Brad Pitt', 'David Thewlis', 'BD Wong', 'Mako']},
 {'cast': ['John Cusack',
   'Billy Bob Thornton',
   'Cate Blanchett',
   'Angelina Jolie']},
 {'cast': ['Angelina Jolie',
   'Elizabeth Mitchell',
   'Eric Michael Cole',
   'Kylie Travis']},
 {'cast': ['Edwa

In [24]:
# Find the films featuring none of the following actors: Sandra Bullock, Tom Hanks, Julia Roberts, Kevin Spacey, George Clooney
pipeline = [
    {
        "$match": {
            "$and": [
                {"cast": {"$ne": "Sandra Bullock"}},
                {"cast": {"$ne": "Tom Hanks"}},
                {"cast": {"$ne": "Julia Roberts"}},
                {"cast": {"$ne": "Kevin Spacey"}},
                {"cast": {"$ne": "George Clooney"}}
            ]
        }
    },
    {
        "$project": {
            "title": 1,
            "cast": 1,
            "_id": 0
        }
    }
]
list(movies.aggregate(pipeline))

[{'cast': ['A.C. Abadie',
   "Gilbert M. 'Broncho Billy' Anderson",
   'George Barnes',
   'Justus D. Barnes'],
  'title': 'The Great Train Robbery'},
 {'cast': ['Frank Powell',
   'Grace Henderson',
   'James Kirkwood',
   'Linda Arvidson'],
  'title': 'A Corner in Wheat'},
 {'cast': ['Winsor McCay'],
  'title': 'Winsor McCay, the Famous Cartoonist of the N.Y. Herald and His Moving Comics'},
 {'cast': ['Jane Gail', 'Ethel Grandin', 'William H. Turner', 'Matt Moore'],
  'title': 'Traffic in Souls'},
 {'cast': ['Winsor McCay', 'George McManus', 'Roy L. McCardell'],
  'title': 'Gertie the Dinosaur'},
 {'cast': ['Stanley Hunt',
   'Sarah Constance Smith Hunt',
   'Mrs. George Walkus',
   "Paddy 'Malid"],
  'title': 'In the Land of the Head Hunters'},
 {'cast': ['Pearl White', 'Crane Wilbur', 'Paul Panzer', 'Edward Josè'],
  'title': 'The Perils of Pauline'},
 {'cast': ['George Beban', 'Clara Williams', 'J. Frank Burke', 'Leo Willis'],
  'title': 'The Italian'},
 {'cast': ['John McCann', '

In [25]:
# List the films released in 2016 or that won at least 100 awards
pipeline = [
    {
        "$match": {
            "$or": [
                {"year": 2016},
                {"awards.wins": {"$gte": 100}}
            ]
        }
    },
    {
        "$project": {
            "title": 1,
            "_id": 0
        }
    }
]
list(movies.aggregate(pipeline))

[{'title': 'Titanic'},
 {'title': 'The Lord of the Rings: The Fellowship of the Ring'},
 {'title': 'The Lord of the Rings: The Return of the King'},
 {'title': 'The Lord of the Rings: The Two Towers'},
 {'title': 'American Beauty'},
 {'title': 'Crouching Tiger, Hidden Dragon'},
 {'title': 'Far from Heaven'},
 {'title': 'Lost in Translation'},
 {'title': 'Inglourious Basterds'},
 {'title': 'Sideways'},
 {'title': 'Brokeback Mountain'},
 {'title': 'The Departed'},
 {'title': 'Lincoln'},
 {'title': "Pan's Labyrinth"},
 {'title': 'Juno'},
 {'title': 'The Dark Knight'},
 {'title': 'There Will Be Blood'},
 {'title': 'No Country for Old Men'},
 {'title': 'The Tree of Life'},
 {'title': 'The Hurt Locker'},
 {'title': 'Precious'},
 {'title': 'Black Swan'},
 {'title': 'Slumdog Millionaire'},
 {'title': 'Boyhood'},
 {'title': 'Precious'},
 {'title': 'Up in the Air'},
 {'title': 'The Social Network'},
 {'title': 'Inception'},
 {'title': 'Gravity'},
 {'title': "The King's Speech"},
 {'title': 'The 

In [26]:
# List the films released after 2010 featuring one of these actors: Sandra Bullock, Tom Hanks, Julia Roberts, Kevin Spacey, George Clooney
pipeline = [
    {
        "$match": {
            "$and": [
                {"year": {"$gt": 2010}},
                {"cast": {"$in": ["Sandra Bullock", "Tom Hanks", "Julia Roberts", "Kevin Spacey", "George Clooney"]}}
            ]
        }
    },
    {
        "$project": {
            "title": 1,
            "cast": 1,
            "_id": 0
        }
    }
]
list(movies.aggregate(pipeline))


[{'cast': ['Tom Hanks', 'Thomas Horn', 'Sandra Bullock', 'Zoe Caldwell'],
  'title': 'Extremely Loud & Incredibly Close'},
 {'cast': ['Kevin Spacey', 'Daniel Wu', 'Beibi Gong', 'Ni Yan'],
  'title': 'Inseparable'},
 {'title': 'The Descendants',
  'cast': ['George Clooney',
   'Shailene Woodley',
   'Amara Miller',
   'Nick Krause']},
 {'cast': ['Ryan Gosling',
   'George Clooney',
   'Philip Seymour Hoffman',
   'Paul Giamatti'],
  'title': 'The Ides of March'},
 {'title': 'August: Osage County',
  'cast': ['Meryl Streep', 'Julia Roberts', 'Chris Cooper', 'Ewan McGregor']},
 {'title': 'Cloud Atlas',
  'cast': ['Tom Hanks', 'Halle Berry', 'Jim Broadbent', 'Hugo Weaving']},
 {'title': 'Gravity',
  'cast': ['Sandra Bullock',
   'George Clooney',
   'Ed Harris',
   'Orto Ignatiussen']},
 {'cast': ['Jason Bateman', 'Steve Wiebe', 'Kevin Spacey', 'Charlie Day'],
  'title': 'Horrible Bosses'},
 {'title': 'Captain Phillips',
  'cast': ['Tom Hanks',
   'Catherine Keener',
   'Barkhad Abdi',
   

In [27]:
# How many films were directed by Clint Eastwood? (aggregation version)
pipeline = [
    {
        "$match": {
            "directors": "Clint Eastwood"
        }
    },
    {
        "$count": "count"
    }
]
list(movies.aggregate(pipeline))

[{'count': 27}]

In [28]:
# How many films were directed by Clint Eastwood? (count_documents version)
movies.count_documents({"directors": "Clint Eastwood"})

27

In [29]:
# Handle on the comments collection
comments = mflix.comments

In [30]:
# Retrieve the list of _id values of the films directed by Clint Eastwood
list_films = list(movies.find({"directors": "Clint Eastwood"}, {"_id": 1}).distinct('_id'))
# list_films

In [31]:
# From the previous list, count the documents of the comments collection that reference one of these films.
# Note: $count here counts matching comments, not distinct films.
pipeline = [
    {
        "$match": {
            "movie_id": {
                "$in": list_films
            }
        }
    },
    {
        "$count": "clint_eastwood_comments"
    }
]
list(comments.aggregate(pipeline))

[{'clint_eastwood_comments': 9}]
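
The question asks how many films were commented on, whereas a `$match` followed directly by `$count` counts comments, so a film with several comments is counted several times. One way to count distinct films is to group by `movie_id` first. A minimal sketch; the helper name `commented_movies_pipeline` and the output field `commented_movies` are hypothetical:

```python
def commented_movies_pipeline(movie_ids):
    """Pipeline for the comments collection: number of distinct commented films."""
    return [
        {"$match": {"movie_id": {"$in": list(movie_ids)}}},
        {"$group": {"_id": "$movie_id"}},  # one document per distinct film
        {"$count": "commented_movies"},
    ]

# Usage with the notebook's variables:
# list(comments.aggregate(commented_movies_pipeline(list_films)))
```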

In [32]:
# How many films are rated "PG-13" (hint: key "rated")? Display the first document.
# Project it to the following fields: _id, title, rated, year, writers and actors.
# Display the same fields again without the _id.
# Note: this dataset stores actors under "cast"; there is no "actors" key, so that field is absent from the output.
pipeline = [
    {
        "$match": {
            "rated": "PG-13"
        }
    },
    {
        "$project": {
            "_id": 0,
            "title": 1,
            "rated": 1,
            "year": 1,
            "writers": 1,
            "actors": 1
        }
    },
    {
        "$limit": 1
    }
]
list(movies.aggregate(pipeline))

[{'rated': 'PG-13',
  'title': 'Wings',
  'writers': ['John Monk Saunders (story)',
   'Hope Loring (screenplay)',
   'Louis D. Lighton (screenplay)',
   'Julian Johnson (titles)'],
  'year': 1927}]

In [53]:
# Find the films rated "PG-13" and produced in 2009
pipeline = [
    {
        "$match": {
            "rated": "PG-13",
            "year": 2009,
        }
    },
    {
        "$project": {
            "_id": 1,
            'year': 1
        }
    }
]
len(list(movies.aggregate(pipeline)))


100

In [54]:
# How many films have the "meter" sub-key of the "tomatoes" key equal to 100?
pipeline = [
    {
        "$match": {
            "tomatoes.viewer.meter": {
                "$gte": 100,
                "$exists": True
            }
        }
    },
    {
        "$project": {
            "tomatoes": 1
        }
    }
]
list(movies.aggregate(pipeline))

[{'_id': ObjectId('573a1391f29313caabcd952a'),
  'tomatoes': {'viewer': {'rating': 4.0, 'numReviews': 17, 'meter': 100},
   'lastUpdated': datetime.datetime(2015, 8, 26, 18, 1, 49)}},
 {'_id': ObjectId('573a1392f29313caabcd99e3'),
  'tomatoes': {'viewer': {'rating': 4.0, 'numReviews': 119, 'meter': 100},
   'production': 'Paramount Pictures',
   'lastUpdated': datetime.datetime(2015, 8, 9, 18, 15, 6)}},
 {'_id': ObjectId('573a1392f29313caabcd9b68'),
  'tomatoes': {'viewer': {'rating': 4.1, 'numReviews': 150, 'meter': 100},
   'lastUpdated': datetime.datetime(2015, 7, 3, 18, 40, 4)}},
 {'_id': ObjectId('573a1392f29313caabcd9c4c'),
  'tomatoes': {'viewer': {'rating': 4.2, 'numReviews': 104, 'meter': 100},
   'dvd': datetime.datetime(2009, 2, 24, 0, 0),
   'production': 'Divisa Home Video',
   'lastUpdated': datetime.datetime(2015, 9, 1, 18, 51, 7)}},
 {'_id': ObjectId('573a1392f29313caabcda653'),
  'tomatoes': {'viewer': {'rating': 4.0, 'numReviews': 49, 'meter': 100},
   'production': '

In [61]:
#confirm the above result
movies.find_one(
    {
        "tomatoes.viewer.meter": {'$exists': True}
    },
    {
        "_id": True,
        "tomatoes.viewer.meter": True
    },
    sort=[("tomatoes.viewer.meter", -1)])

{'_id': ObjectId('573a1391f29313caabcd952a'),
 'tomatoes': {'viewer': {'meter': 100}}}

In [36]:
# How many films feature "Jeff Bridges"?
movies.count_documents(
    {"cast": "Jeff Bridges"}
)


44

In [37]:
# Same, but with "Jeff Bridges" in the first position of the "cast" array
movies.count_documents(
    {"cast.0": "Jeff Bridges"}
)

25

In [38]:
# How many films have a "runtime" of at least 90 min AND at most 120 min?
movies.count_documents({
    "runtime": {"$gte": 90,
                "$lte": 120}
})


13060

In [40]:
# How many films have a "runtime" of at least 90 min AND at most 120 min? (aggregation version)
pipeline = [
    {
        "$match": {
            "runtime": {"$gte": 90,
                        "$lte": 120}
        }
    },
    {
        "$count": "runtime_counter"
    }
]
list(movies.aggregate(pipeline))

[{'runtime_counter': 13060}]

In [44]:
# Original question: how many films have a tomatoes meter above 95 OR a "metacritic" above 88?
# The query below instead counts films whose viewer meter AND critic meter are both >= 30.
movies.count_documents({"tomatoes.viewer.meter": {'$gte': 30},
                        "tomatoes.critic.meter": {'$gte': 30}})

8618
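
The stated question combines two conditions with OR, which in the MongoDB query language calls for `$or`; listing several keys in one filter document combines them with AND. A sketch of the OR form, assuming the relevant meter is `tomatoes.viewer.meter` (the source does not say which meter is meant); the helper name `high_score_filter` is hypothetical:

```python
def high_score_filter(meter_min=95, metacritic_min=88):
    """Films with a viewer meter above meter_min OR a metacritic above metacritic_min."""
    return {
        "$or": [
            {"tomatoes.viewer.meter": {"$gt": meter_min}},
            {"metacritic": {"$gt": metacritic_min}},
        ]
    }

# Usage with the notebook's collection handle:
# movies.count_documents(high_score_filter())
```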

In [47]:
# How many films have a tomatoes viewer meter >= 95 AND a "runtime" >= 180 min?
movies.count_documents({"tomatoes.viewer.meter": {'$gte': 95},
                        "runtime": {'$gte': 180}})

26

In [67]:
# How many films have a tomatoes viewer meter >= 95 AND a "runtime" >= 180 min? (aggregation version)
pipeline = [
    {
        "$match": {
            "$and": [
                {"tomatoes.viewer.meter": {"$gte": 95}},
                {"runtime": {"$gte": 180}}
            ]
        },
    },
]
len(list(movies.aggregate(pipeline)))

26

In [76]:
# How many films have a "tomatoes.viewer.meter" key that is not equal to "blabla"?
movies.count_documents({"tomatoes.viewer.meter": {"$ne": "blabla"}})  # same as movies.count_documents({})

21349

In [79]:
# How many films have the "tomatoes.viewer.meter" key?
movies.count_documents({"tomatoes.viewer.meter": {"$exists": 1}})

16199

In [80]:
# Conversely, how many films lack the "tomatoes.viewer.meter" key?
movies.count_documents({"tomatoes.viewer.meter": {"$exists": 0}})

5150

In [84]:
# In the document titled "Blacksmith Scene", set the value of the attribute movie to "blabla"
movies.update_one(
    {"title": "Blacksmith Scene"},
    {"$set": {"movie": "blabla"}}
)

UpdateResult({'n': 0, 'electionId': ObjectId('7fffffff00000000000000b2'), 'opTime': {'ts': Timestamp(1725309777, 92), 't': 178}, 'nModified': 0, 'ok': 1.0, '$clusterTime': {'clusterTime': Timestamp(1725309777, 92), 'signature': {'hash': b'}XO\xbc\x82\x16\x99\xe4G\x85\xd3\xcdg\xb20\xc6v\xe1;\xbe', 'keyId': 7351789228159664129}}, 'operationTime': Timestamp(1725309777, 92), 'updatedExisting': False}, acknowledged=True)
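
The result above reports `'n': 0` and `'updatedExisting': False`, meaning no document matched the filter and nothing was written. `update_one` does not raise in that case, so it helps to inspect the `matched_count` and `modified_count` attributes of the returned `UpdateResult`. A minimal sketch; the helper name `summarize_update` is hypothetical:

```python
def summarize_update(result):
    """Describe an UpdateResult using its matched_count and modified_count."""
    if result.matched_count == 0:
        return "no document matched the filter"
    return f"matched {result.matched_count}, modified {result.modified_count}"

# Usage with the notebook's collection handle:
# res = movies.update_one({"title": "Blacksmith Scene"}, {"$set": {"movie": "blabla"}})
# print(summarize_update(res))
```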

In [109]:
# How many films were written by "Ethan Coen" and "Joel Coen"?
# Note: $in matches films that list either writer, not necessarily both.
pipeline = [
    {"$match": {
        "writers": {"$in": ["Ethan Coen", "Joel Coen"]}
    }}
]
len(list(movies.aggregate(pipeline)))

14

In [110]:
movies.count_documents({"writers": {"$in": ["Joel Coen", "Ethan Coen"]}})

14

In [127]:
movies.find({"writers": {"$eq": ["Ethan Coen", "Joel Coen"]}}).distinct("_id")

[ObjectId('573a1398f29313caabcea992'),
 ObjectId('573a139af29313caabcefe7e'),
 ObjectId('573a139af29313caabcf0797')]

In [128]:
movies.find({"writers": {"$eq": ["Joel Coen", "Ethan Coen"]}}).distinct("_id")

[ObjectId('573a1398f29313caabce8fe0'),
 ObjectId('573a1399f29313caabcec141'),
 ObjectId('573a1399f29313caabcec611'),
 ObjectId('573a13a3f29313caabd0db04'),
 ObjectId('573a13bbf29313caabd52b17'),
 ObjectId('573a13bdf29313caabd597d1'),
 ObjectId('573a13d5f29313caabd9d807')]
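
The two `$eq` queries above match the `writers` array exactly, order included, which is why they return different films. To require both names regardless of order or of additional writers, `$all` can be used. A sketch, assuming the names appear in `writers` without a suffix such as "(screenplay)"; the helper name `written_by_all` is hypothetical:

```python
def written_by_all(*names):
    """Filter for films whose writers array contains every given name."""
    return {"writers": {"$all": list(names)}}

# Usage with the notebook's collection handle:
# movies.count_documents(written_by_all("Ethan Coen", "Joel Coen"))
```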

In [141]:
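# How many films were produced by a single country?
# This groups by the exact countries array; the single-element arrays are the single-country films.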
pipeline = [
    {"$group": {"_id": "$countries", "total": {"$sum": 1}}},
    {"$sort": {"total": -1}},
]
list(movies.aggregate(pipeline))[:5]

[{'_id': ['USA'], 'total': 8225},
 {'_id': ['UK'], 'total': 999},
 {'_id': ['France'], 'total': 717},
 {'_id': ['Japan'], 'total': 556},
 {'_id': ['Canada'], 'total': 498}]
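
The grouping above lists counts per distinct `countries` array but does not give one total for single-country films. The `$size` operator matches arrays of an exact length, so one possible way to get that total is sketched below; the names `single_country_filter`, `single_country_pipeline` and `single_country_films` are hypothetical.

```python
# Films whose countries array has exactly one element.
single_country_filter = {"countries": {"$size": 1}}

# Equivalent aggregation form that returns the total as one document.
single_country_pipeline = [
    {"$match": single_country_filter},
    {"$count": "single_country_films"},
]

# Usage with the notebook's collection handle:
# movies.count_documents(single_country_filter)
# list(movies.aggregate(single_country_pipeline))
```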

In [143]:
pipeline = [
    {
        "$unwind": "$countries"  # Split each countries array into one document per country
    },
    {
        "$group": {
            "_id": "$countries",  # Group by production country
            "filmCount": {"$sum": 1}  # Count the films for each country
        }
    },
    {
        "$project": {
            "_id": 0,  # Leave the group _id out of the results
            "country": "$_id",  # Rename the group _id to "country"
            "filmCount": 1  # Keep the film count
        }
    },
    {
        "$sort": {"filmCount": -1}
    }
]
list(movies.aggregate(pipeline))[:5]

[{'filmCount': 10921, 'country': 'USA'},
 {'filmCount': 2652, 'country': 'UK'},
 {'filmCount': 2647, 'country': 'France'},
 {'filmCount': 1494, 'country': 'Germany'},
 {'filmCount': 1260, 'country': 'Canada'}]

In [144]:
# How many films have a genres array that contains Comedy, Crime and Drama?
pipeline = [
    {
        "$match": {
            "genres": {
                "$all": ["Comedy", "Crime", "Drama"],  # The array must contain all of these genres
            }
        }
    },
    {
        "$count": "numMovies"  # Count the documents that match the criterion
    }
]
list(movies.aggregate(pipeline))

[{'numMovies': 155}]

In [168]:
result=list(movies.find({
    "countries":["UK"],
    "tomatoes.boxOffice":{"$exists":True}
},
    {"tomatoes.boxOffice":1}
))
list_suffix=set(elem['tomatoes']['boxOffice'][0] for elem in result)
list_suffix


{'$'}

In [207]:
pipeline = [
    {
        "$addFields": {
            "boxOfficeUK": {
                "$cond": {
                    "if": {"$regexMatch": {"input": "$tomatoes.boxOffice", "regex": "^\\$[0-9]+(\\.[0-9]+)?M"}},
                    "then": {
                        "$toDouble": {
                            "$substr": [
                                "$tomatoes.boxOffice",
                                1,
                                {"$subtract": [{"$strLenBytes": "$tomatoes.boxOffice"}, 2]}
                            ]
                        }
                    },
                    "else": 0
                }
            }
        }
    },
    {
        "$match": {
            "countries":["UK"],
            "boxOfficeUK": {"$gt": 5}
        }
    },
    {
        "$project": {
            "_id": 0,
            "country": 1,
            "tomatoes.boxOffice":1
        }
    }
]

list(movies.aggregate(pipeline))

[{'tomatoes': {'boxOffice': '$11.5M'}},
 {'tomatoes': {'boxOffice': '$44.9M'}},
 {'tomatoes': {'boxOffice': '$8.0M'}},
 {'tomatoes': {'boxOffice': '$26.0M'}},
 {'tomatoes': {'boxOffice': '$17.4M'}},
 {'tomatoes': {'boxOffice': '$25.4M'}},
 {'tomatoes': {'boxOffice': '$39.7M'}},
 {'tomatoes': {'boxOffice': '$30.2M'}},
 {'tomatoes': {'boxOffice': '$9.0M'}},
 {'tomatoes': {'boxOffice': '$18.4M'}},
 {'tomatoes': {'boxOffice': '$6.4M'}},
 {'tomatoes': {'boxOffice': '$12.8M'}},
 {'tomatoes': {'boxOffice': '$15.0M'}},
 {'tomatoes': {'boxOffice': '$15.3M'}},
 {'tomatoes': {'boxOffice': '$10.7M'}},
 {'tomatoes': {'boxOffice': '$128.3M'}},
 {'tomatoes': {'boxOffice': '$35.9M'}}]
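
The pipeline above converts strings such as `$11.5M` to numbers on the server. The same conversion can be sketched on the client side, which can be easier to check on a few values. The helper name `parse_box_office_millions` is hypothetical, and it only handles the dollar amounts with an `M` suffix that appear in the output above; any other format returns `None`.

```python
import re

# "$", digits, optional decimal part, then the "M" suffix
_BOX_OFFICE_RE = re.compile(r"^\$(\d+(?:\.\d+)?)M$")


def parse_box_office_millions(text):
    """Return the amount in millions of dollars, or None if the format differs."""
    match = _BOX_OFFICE_RE.match(text)
    if match is None:
        return None
    return float(match.group(1))

# Usage on documents fetched with a projection on "tomatoes.boxOffice":
# [parse_box_office_millions(d["tomatoes"]["boxOffice"]) for d in result]
```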