# Exercise sheet \#5
## Using MongoDB
### Exercise 1
For this exercise, you will work with the Paris Tourist Information dataset (see zip file on ARCHE).
This dataset contains information about sightseeing in Paris. Each document describes a venue belonging to one of the following types:
- points of interest (POI)
- restaurants
- attractions
- accommodations

Here is an example of a document:
<pre>
{
   "_id" : 83292,
   "contact" : {
      "website" : "http://www.trocaderolatour.com",
      "GooglePlaces" : "https://plus.google.com/107754700607079935569/about?hl=en-US"
   },
   "name" : "Best Western Premier Trocadero La Tour",
   "location" : {
      "city" : "Paris",
      "coord" : {"coordinates" : [2.2795155644417,48.858311118724],"type" : "Point"},
      "address" : "Paris,   France    5 bis, rue Massenet, 16. Trocadéro - Passy, 75016 Paris"
   },
   "category" : "accommodation",
   "description" : " Situé à 15 minutes à pied de la tour Eiffel, le Best Western Premier Trocadero La Tour bénéficie d'un emplacement idéal pour découvrir Paris. Il abrite un bar lambrissé doté de fauteuils en cuir et un patio.",
   "services" : [
      "jardin",
      "terrasse",
      "journaux",
      "bar",
      "petit-déjeuner en chambre",
      "réception ouverte 24h 24",
      "enregistrement et règlement rapides",
      "bagagerie",
      "service d'étage",
      "salles de réunions banquets",
      "centre d'affaires",
      "garde d'enfants",
      "blanchisserie",
      "chambres non-fumeurs"
   ],
   "reviews" : [
      {
          "wordsCount" : 30,
          "rating" : 0,
          "language" : "en",
          "source" : "Foursquare",
          "text" : "Nice beds, rooms andstaff. Perfect central location. Breakfast is very expensive for a contenintal breakfast, however many bakeries and restaurants in the area. Will stay here again my next visit.",
          "time" : "2010-09-30"
      }
   ]
}
</pre>
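The nested structure of such a document is what the dot notation in later queries (for example `location.address`) refers to. The sketch below mimics that lookup on a plain Python dictionary; `get_path` is a hypothetical helper written for illustration, not part of MongoDB or pymongo, and the `venue` dictionary is a shortened copy of the example above.

```python
def get_path(document, dotted_path, default=None):
    """Follow a dot-separated path such as 'location.address' through nested dicts."""
    current = document
    for key in dotted_path.split("."):
        # Stop as soon as the path leaves the nested dictionaries
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current


# Shortened copy of the example document
venue = {
    "_id": 83292,
    "name": "Best Western Premier Trocadero La Tour",
    "location": {
        "city": "Paris",
        "coord": {"coordinates": [2.2795155644417, 48.858311118724], "type": "Point"},
    },
    "category": "accommodation",
    "services": ["jardin", "terrasse", "bar", "blanchisserie"],
}

# A GeoJSON "Point" stores its coordinates as [longitude, latitude]
longitude, latitude = get_path(venue, "location.coord.coordinates")
print(get_path(venue, "location.city"), longitude, latitude)
```

Unlike MongoDB's own dot notation, this helper does not descend into arrays; it is only meant to make the document layout easier to read.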

#### Question 1.1 - Setting up the database
- Install a local MongoDB server on your machine, along with a [Robo3T](https://robomongo.org/) MongoDB client.
- Create a database named "tourPedia" containing a collection named "paris".
- Import the content of the `tour-Pedia_paris.json` file into that collection.
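As one possible alternative to a graphical import, the steps above could be scripted with pymongo. This is a minimal sketch under several assumptions: the layout of `tour-Pedia_paris.json` is not stated in this sheet, so `load_documents` (a hypothetical helper) accepts either a single JSON array or one JSON document per line; the server is assumed to listen on `mongodb://localhost:27017`; and the file is assumed to contain plain JSON values only.

```python
import json
from pathlib import Path


def load_documents(path):
    """Read documents from a JSON array file or from a one-document-per-line file."""
    text = Path(path).read_text(encoding="utf-8").strip()
    if text.startswith("["):
        return json.loads(text)
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def import_into_mongo(path, uri="mongodb://localhost:27017"):
    """Insert the documents into tourPedia.paris and return the collection size."""
    # Imported here so that load_documents can be tried without pymongo or a server
    from pymongo import MongoClient

    client = MongoClient(uri)
    collection = client["tourPedia"]["paris"]  # both are created on first insert
    documents = load_documents(path)
    if documents:
        collection.insert_many(documents)
    return collection.count_documents({})
```

Calling `import_into_mongo("tour-Pedia_paris.json")` a second time would likely fail with duplicate-key errors, since the documents carry their own `_id` values.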

NB: For questions 1.2 to 1.5, please use the [Robo3T](https://robomongo.org/) graphical MongoDB client to design and check your queries.


#### Question 1.2 - Filtering and projecting data
- Select the venues whose type (`category` field) is "accommodation" and whose services include "blanchisserie" (laundry).
- Project the addresses of venues whose type is "accommodation".

#### Question 1.3 - Constrained filtering
- Select the lists of reviews of venues that have at least one English review with a rating greater than 3.

#### Question 1.4 - Grouping data
- Group venues by type and count them.

#### Question 1.5 - Aggregating data
- For venues of type "accommodation", give the number of venues per "service".

### Exercise 2
For this exercise, we will reuse the data from Exercise 1.

In the following questions (which are similar to Exercise 1), you are required to use [pymongo](https://api.mongodb.com/python/current/api/pymongo/index.html).

#### Question 2.1 - Filtering and projecting data
- Select the venues whose type (`category` field) is "accommodation" and whose services include "blanchisserie" (laundry).
- Project the addresses of venues whose type is "accommodation".

Compare your results with those of question 1.2 above.

In [12]:
from pymongo import MongoClient

client = MongoClient('mongodb://localhost:27017')

# Verify the connection by listing the databases on the server
database_names = client.list_database_names()
for db_name in database_names:
    print(db_name)


admin
config
local
tourPedia


In [14]:
db = client['tourPedia']
paris_collection = db['paris']

# Query 1: accommodation venues offering a laundry service.
# A scalar condition on the "services" array matches any document
# whose array contains that value.
print("Filtered Venues:")
for venue in paris_collection.find(
    {
        "category": "accommodation",
        "services": "blanchisserie"
    }
):
    print(f"Name: {venue.get('name')}, Address: {venue.get('location', {}).get('address')}")

# Query 2: names and addresses of accommodation venues.
# The second argument is the projection: 1 keeps a field,
# and "_id": 0 drops the "_id" field that is returned by default.
print("\nProjected Accommodation Venues:")
for venue in paris_collection.find(
    {
        "category": "accommodation"
    },
    {
        "_id": 0,
        "name": 1,
        "location.address": 1
    }
):
    print(f"Name: {venue.get('name')}, Address: {venue.get('location', {}).get('address')}")


Filtered Venues:
Name: Arès Tour Eiffel, Address: Paris,   France    7 rue du Général de Larminat, 15. Eiffel Tower - Porte de Versailles, 75015 Paris
Name: Ampère, Address: Paris,   France    102 Avenue de Villiers, 17. Palais des Congrès - Batignolles, 75017 Paris
Name: Hôtel Bourgogne & Montana, Address: Paris,   France    3 rue de Bourgogne, 07. Invalides - Eiffel Tower, 75007 Paris
Name: Grand Hotel Francais, Address: Paris,   France    223 Boulevard Voltaire, 11. Bastille - République, 75011 Paris
Name: Best Western Premier Trocadero La Tour, Address: Paris,   France    5 bis, rue Massenet, 16. Trocadéro - Passy, 75016 Paris
Name: Marceau Champs-Elysées, Address: Paris,   France    37 Avenue Marceau, 16. Trocadéro - Passy, 75016 Paris
Name: Best Western Hôtel Victor Hugo, Address: Paris,   France    19 Rue Copernic, 16. Trocadéro - Passy, 75016 Paris
Name: Hotel Scribe Paris managed by Sofitel, Address: Paris,   France    1 Rue Scribe, 09. Opera - Haussmann, 75009 Paris
Name: Hôt

Name: Pavillon Des Oiseaux, Address: Paris, , France
Name: Place Vendôme, Address: Place Vendôme, Paris, 75001, France
Name: Pratic Hôtel, Address: 31 Rue Germain Pilon, Paris, France
Name: HOTEL DES ARTS MONTMARTRE - PARIS, Address: 5 rue Tholoze, Paris, 75018, France
Name: Restaurant Libanais Noura Opera, Address: Paris, , France
Name: HOTEL BUCI - 4**** Paris - Saint-Germain des Prés, Address: Hôtel de Buci- 22 rue de Buci , Paris, 75006, France
Name: Hotel Amour, Address: 8 Rue de Navarin, Paris, France
Name: Marriott Hotel, Address: 17 Boulevard Saint-Jacques, Paris, 75014, France
Name: Fiap, Address: 30 Rue Cabanis, Paris, 75014, France
Name: Hôtel Ibis La Fayette, Address: 122 Rue la Fayette, Paris, 75010, France
Name: Hôtel des Arènes, Address: 51 Rue Monge, Paris, France
Name: Hotel Merryl - Gare du Nord, Address: 7-9, rue Pajol, Paris, 75018, France
Name: Centre Paris, Address: 3 Rue René Boulanger, Paris, France
Name: Hôtel Reims (Paris), Address: 32 Rue d'Aubervilliers, Par
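Both queries return many venues, and the listings above are cut off. When comparing with the Robo3T results, it may be enough to look at the first few matches. A minimal sketch, assuming the venues come from any iterable of dictionaries (a pymongo cursor or a plain list); `print_first` is a hypothetical helper name.

```python
from itertools import islice


def print_first(venues, limit=10):
    """Print at most `limit` venues from any iterable and return how many were printed."""
    shown = 0
    for venue in islice(venues, limit):
        address = venue.get("location", {}).get("address")
        print(f"Name: {venue.get('name')}, Address: {address}")
        shown += 1
    return shown


# Stand-in data; with a running server, pass paris_collection.find(...) instead
sample = [
    {"name": "Hotel Amour", "location": {"address": "8 Rue de Navarin, Paris, France"}},
    {"name": "Fiap", "location": {"address": "30 Rue Cabanis, Paris, 75014, France"}},
    {"name": "Centre Paris", "location": {"address": "3 Rue René Boulanger, Paris, France"}},
]
print_first(sample, limit=2)
```

With pymongo, the same effect can be obtained on the server side by chaining `.limit(n)` on the cursor, and `paris_collection.count_documents(filter)` gives the total number of matches without printing them.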

#### Question 2.2 - Constrained filtering
- Select the lists of reviews of venues that have at least one English review with a rating greater than 3.

Compare your results with those of question 1.3 above.

In [34]:

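# Query: venues with at least one English review rated above 3.
# $elemMatch requires both conditions to hold for the same review.
print("Filtered Reviews:")
for venue in paris_collection.find(
    {
        'reviews': {
            '$elemMatch': {
                'language': 'en',
                'rating': {'$gt': 3}
            }
        }
    },
    # Projection: keep the name and the full list of reviews, drop "_id"
    {
        'name': 1, 'reviews': 1, '_id': 0
    }
):
    print(f"Reviews of '{venue.get('name')}':")
    for review in venue.get('reviews', []):
        print(f"Review: {review}")
    print("\n")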


Filtered Reviews:
Reviews of 'Le Congrès Maillot':
Review: {'wordsCount': 27, 'rating': 0, 'language': 'en', 'details': 'http://tour-pedia.org/api/getReviewDetails?id=52a74a85ae9eef5a506719c0', 'source': 'Foursquare', 'text': 'Food is not bad, eating on the terrace is fine but... Expensive for what it is and the staff is really too Parisian (other word for unfriendly)', 'time': '2010-07-09', 'polarity': 5}
Review: {'wordsCount': 11, 'rating': 0, 'language': 'fr', 'details': 'http://tour-pedia.org/api/getReviewDetails?id=52a74a85ae9eef5a506719c1', 'source': 'Foursquare', 'text': 'Cuisine assez fine! Peux rien dire sur le prix! Était invité!', 'time': '2011-04-21', 'polarity': 10}
Review: {'wordsCount': 34, 'rating': 0, 'language': 'en', 'details': 'http://tour-pedia.org/api/getReviewDetails?id=52a74a85ae9eef5a506719c2', 'source': 'Foursquare', 'text': 'The waiting to get a seat is long and the waiting area only holds 4 people. This means if it is full you are stuck standing in the way w
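The listing shows reviews with `'rating': 0` or `'language': 'fr'` because the projection returns the full `reviews` list of every matching venue, not only the reviews that satisfied the condition. The sketch below reproduces the matching rule on plain dictionaries to make two points concrete: `$elemMatch` needs one review that meets both conditions, whereas the dotted form `{"reviews.language": "en", "reviews.rating": {"$gt": 3}}` can be satisfied by two different reviews. The function names are hypothetical and the sample data is invented; ratings are assumed to be numeric.

```python
def matches_elem_match(venue):
    """True if a single review is both English and rated above 3 (like $elemMatch)."""
    return any(
        review.get("language") == "en" and review.get("rating", 0) > 3
        for review in venue.get("reviews", [])
    )


def matches_dotted(venue):
    """True if some review is English and some (possibly different) review is rated above 3."""
    reviews = venue.get("reviews", [])
    has_english = any(review.get("language") == "en" for review in reviews)
    has_high_rating = any(review.get("rating", 0) > 3 for review in reviews)
    return has_english and has_high_rating


def matching_reviews(venue):
    """Keep only the reviews that satisfy both conditions."""
    return [
        review
        for review in venue.get("reviews", [])
        if review.get("language") == "en" and review.get("rating", 0) > 3
    ]


# Invented sample: the English review is rated low, the French one is rated high
venue_a = {"name": "A", "reviews": [{"language": "en", "rating": 1},
                                    {"language": "fr", "rating": 5}]}
# Invented sample: one English review rated above 3
venue_b = {"name": "B", "reviews": [{"language": "en", "rating": 4},
                                    {"language": "fr", "rating": 0}]}

for venue in (venue_a, venue_b):
    print(venue["name"], matches_elem_match(venue), matches_dotted(venue))
```

If only the qualifying reviews are of interest, `matching_reviews` can be applied to each venue returned by the query before printing.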




#### Question 2.3 - Grouping data
- Group venues by type and count them.

Compare your results with those of question 1.4 above.

In [37]:
category_count = [
    {"$group": {"_id": "$category", "count": {"$sum": 1}}}
]

grouped_venues = paris_collection.aggregate(category_count)

# Display the results
print("Number of venues by category :")
for group in grouped_venues:
    print(f"Category: {group['_id']}, Count: {group['count']}")

Number of venues by category :
Category: attraction, Count: 4316
Category: accommodation, Count: 3376
Category: poi, Count: 26846
Category: restaurant, Count: 21823
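The groups above come back in no particular order. A possible refinement is to append a `$sort` stage so that the largest category comes first. The sketch below shows such a pipeline together with a client-side equivalent built on `collections.Counter`, which can be used to cross-check the logic on a small sample; `count_by_category` is a hypothetical helper and the sample documents are invented.

```python
from collections import Counter

# Same grouping as above, followed by a sort on the count (largest first)
category_count_sorted = [
    {"$group": {"_id": "$category", "count": {"$sum": 1}}},
    {"$sort": {"count": -1}},
]


def count_by_category(venues):
    """Client-side equivalent of the pipeline, for cross-checking on a small sample."""
    counts = Counter(venue.get("category") for venue in venues)
    return [{"_id": category, "count": n} for category, n in counts.most_common()]


# Invented sample documents
sample = [
    {"category": "poi"},
    {"category": "restaurant"},
    {"category": "poi"},
]
print(count_by_category(sample))
```

With a running server, the pipeline would be passed to `paris_collection.aggregate(category_count_sorted)` exactly like the unsorted one.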


#### Question 2.4 - Aggregating data
- For venues of type "accommodation", give the number of venues per "service".

Compare your results with those of question 1.5 above.

In [43]:
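accommodation_pipeline = [
    # Keep accommodation venues only
    {"$match": {"category": "accommodation"}},
    # Emit one document per element of the "services" array
    {"$unwind": "$services"},
    # Count the unwound documents per service
    {"$group": {"_id": "$services", "count": {"$sum": 1}}}
]

service_counts = paris_collection.aggregate(accommodation_pipeline)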


print("Number of venues per service:")
for service in service_counts:
    print(f"Service: {service['_id']}, Count: {service['count']}")

Number of venues per service:
Service: parcours de golf (à moins de 3 km), Count: 3
Service: estonien, Count: 1
Service: piscine intérieure, Count: 15
Service: français, Count: 358
Service: distributeur automatique (collations), Count: 56
Service: enregistrement départ privé, Count: 24
Service: sauna, Count: 37
Service: chambres familiales, Count: 655
Service: service de concierge, Count: 380
Service: navette aéroport, Count: 173
Service:  italien, Count: 145
Service: grec, Count: 4
Service: slovène, Count: 1
Service: anglais, Count: 748
Service: livraison de courses, Count: 1
Service: discothèque, Count: 2
Service:  chinois, Count: 12
Service: salle de jeux, Count: 7
Service: petit-déjeuner en chambre, Count: 782
Service: bain turc à vapeur, Count: 43
Service: prêt de vélos, Count: 1
Service: service de repassage, Count: 290
Service: blanchisserie, Count: 616
Service: équipe d'animation, Count: 2
Service:  japonais, Count: 3
Service: piscine intérieure (toute l'année), Count: 8
Servic
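Some service names in the output above start with a space (for example ` italien`, ` chinois`, ` japonais`), so they are counted separately from any untrimmed spelling of the same service. One way to merge them, sketched here under the assumption that every element of `services` is a string, is to group on a trimmed value with the `$trim` operator (available in MongoDB 4.0 and later). The `count_services` helper is hypothetical and reproduces the same logic on plain dictionaries; the sample documents are invented.

```python
from collections import Counter

# Variant of the pipeline above: trim whitespace before grouping, then sort by count
services_pipeline_trimmed = [
    {"$match": {"category": "accommodation"}},
    {"$unwind": "$services"},
    {"$group": {"_id": {"$trim": {"input": "$services"}}, "count": {"$sum": 1}}},
    {"$sort": {"count": -1}},
]


def count_services(venues):
    """Client-side equivalent: match, unwind, trim, and count on plain dictionaries."""
    counts = Counter()
    for venue in venues:
        if venue.get("category") != "accommodation":
            continue  # corresponds to the $match stage
        for service in venue.get("services", []):  # corresponds to $unwind
            counts[service.strip()] += 1  # corresponds to $trim + $group
    return counts


# Invented sample documents
sample = [
    {"category": "accommodation", "services": ["bar", " italien"]},
    {"category": "accommodation", "services": ["italien"]},
    {"category": "restaurant", "services": ["bar"]},
    {"category": "accommodation"},
]
print(count_services(sample).most_common())
```

Note that a venue listing both spellings of one service would then be counted twice for it, so the trimmed counts are an upper bound on the number of distinct venues per service.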