<div style="line-height:0.5">
<h1 style="color:#2C5C0E"> Mongodb basics 1 </h1>
<h4> Setting the localhost connection and performing simple operations. </h4>
<span style="display: inline-block;">
    <h3 style="color: lightblue; display: inline;">Keywords:</h3> pymongo + json file
</span>
</div>
<br>
</div>

In [14]:
import json
import pymongo
from datetime import datetime
from bson.objectid import ObjectId

<h3 style="color:#2C5C0E  "> Recap: </h3>
<div style="margin-top: -8px;">

In MongoDB, a collection is a database component used to store multiple documents. <br>
Documents in a collection can be thought of as similar to rows or records in a traditional relational database. <br>

MongoDB is: <br>
=> schema-less, meaning that each document in a collection can have a different structure. <br> 
=> flexible; It is possible to store data with varying fields and data types within the same collection. <br>
=> without fixed keys! (unlike dictionaries). <br>
=> horizontal scaling 
</div>

<h3 style="color:#2C5C0E  "> Instructions (Linux debian): </h3>

In [116]:
%%script echo Skipping, since already extracted.
# Install MongoBD GUI program (deb package)
sudo dpkg -i mongodb-compass_1.40.4_amd64.deb

Skipping, since already extracted.


In [117]:
%%script echo Skipping, since already extracted.
## Extract downloaded archive and make the program globally executable
!tar -zxvf mongodb-linux-x86_64-ubuntu2204-7.0.2.tgz
!sudo mv mongodb-linux-x86_64-ubuntu2204-7.0.2/bin/mongod /usr/local/bin/

Skipping, since already extracted.


In [118]:
%%script echo Skipping. Should be done only the first time, after installation
## Create a database directory and grant permissions
!sudo mkdir -p /data/db/
!sudo chown -R $USER /data/db/

Skipping. Should be done only the first time, after installation


In [119]:
%%script echo Skipping. Clearly, the server must be run in an external terminal and not here, to avoid stalling the notebook execution. 
# Start a MongoDB server on localhost (it can be specified a different path with: "$mongod --dbpath /path/to/db)"
!mongod

Skipping. Clearly, the server must be run in an external terminal and not here, to avoid stalling the notebook execution.


Before starting, delete the old database from MongoDB Compass application or via terminal launching both '$mongod' and '$mongosh' in 2 separated terminals. <br>

mongosh shell commands:
<div style="margin-top: -10px;">

- test > show dbs
- test > use music_management
- music_management> db.dropDatabase()
- music_management> show dbs
- music_management> exit
</div>

Note that the directory associated with the removed db may not be immediately removed from the file system. music_management>

In [120]:
""" Connect to the MongoDB server.
N.B.
Run locally on the default port (27017)
#client = pymongo.MongoClient("mongodb://localhost:27017/")
"""
client = pymongo.MongoClient("mongodb://127.0.0.1:27017/")

In [121]:
# Create or switch to a database for the music management system
db = client["music_archive_management"]

In [4]:
##### Create collections
bands_collection = db["bands"]
venues_collection = db["venues"]
contracts_collection = db["contracts"]
sponsors_collection = db["sponsors"]
albums_collection = db["albums"]

<h3 style="color:#2C5C0E  "> Document definitions </h3>

In [123]:
""" Documents for bands 1 """
girlschool = {
    "name": "Girlschool",
    "members": [
        {"name": "Denise", "role": "drums"},
        {"name": "Kim", "role": "lead vocals, guitars"},
        {"name": "Jackie", "role": "guitars"},
        {"name": "Tracey", "role": "bass"}
    ],
    "genres": ["Heavy metal", "Hard rock", "NWOBHM"],
    "origin": "London",
    "num_released_lp": 14,
    "studio_albums": [
        {"year": 1980, "title": "Demolition"},
        {"year": 1981, "title": "Hit and Run"},
        {"year": 1982, "title": "Screaming Blue Murder"},
        {"year": 1983, "title": "Play Dirty"},
        {"year": 1985, "title": "Running Wild"},
        {"year": 1986, "title": "Nightmare at Maple Cross"},
        {"year": 1988, "title": "Take a Bite"},
        {"year": 1993, "title": "Girlschool"},
        {"year": 2002, "title": "21st Anniversary: Not That Innocent"},
        {"year": 2004, "title": "Believe"},
        {"year": 2008, "title": "Legacy"},
        {"year": 2011, "title": "Hit and Run - Revisited"},
        {"year": 2015, "title": "Guilty as Sin"},
        {"year": 2023, "title": "WTFortyfive?"}
    ],
    "labels": ["City", "Bronze", "Mercury"], 
    "founded_year": 1977,
}
type(girlschool)

dict

In [124]:
""" Documents for bands 2 
N.B 
The attributes can be different even if they are going to be collected in the same collection!!!
"""
l7 = {
    "name": "L7",
    "members": [
        {"name": "Donita", "role": "lead vocals, guitars"},
        {"name": "Suzi", "role": "guitars, vocals"},
        {"name": "Jennifer", "role": "guitars, vocals"},
        {"name": "Demetra", "role": "bass, vocals"}
    ],
    "genres": ["Grunge", "Punk Rock"],
    "origin": "Los Angeles",
    "num_released_lp": 12,
    "studio_albums": [
        {"year": 1988, "title": "L7"},
        {"year": 1988, "title": "Smell the magic"},
        {"year": 1991, "title": "Bricks are Heavy"},
        {"year": 1994, "title": "Hungry for Stink"},
        {"year": 1996, "title": "The Beauty Process: Triple Platinum"},
        {"year": 1999, "title": "Slap-Happy"},
        {"year": 2016, "title": "Wireless"},
        {"year": 2018, "title": "Detroit"},
        {"year": 2019, "title": "Scatter the Rats"},
    ],
    "labels": ["Epitaph", "Sub Pop", "Reprise"], 
    "foundation_year": 1985,    
}

lunachicks = {
    "name": "Lunachicks",
    "members": [
        {"name": "Theoi", "role": "lead vocals"},
        {"name": "Gina", "role": "guitar, vocals"},
        {"name": "Sydney", "role": "bass, vocals"},
        {"name": "Sindi", "role": "guitar"},
        {"name": "Chip", "role": "drums"}
    ],
    "genres": ["Punk rock", "Indie rock"],
    "origin": "New York",
    "num_released_lp": 6,
    "labels": ["Blast First", "Safe House", "Go Kart"], 
    "foundation_yearear": 1988,
}

novatwins = {
    "name": "Nova Twins",
    "members": [
        {"name": "Amy", "role": "vocals, guitar"},
        {"name": "Georgia", "role": "vocals, bass"},
    ],
    "genres": ["Rap Instrumental", "Alternative rock", "Nu Metal"],
    "origin": "London",
    "num_released_lp": 2,
    "studio_albums": [
        {"year": 2020, "title": "Who are the girls?"},
        {"year": 2022, "title": "Supernova"},
    ],
    "labels": ["Marshall"], 
    "foundation_year": 2014,
}

scowl = {
    "name": "Nova Twins",
    "members": [
        {"name": "Kat", "role": "vocals"},
        {"name": "Malachi", "role": "guitars"},
        {"name": "Mikey", "role": "guitars"},
        {"name": "Bailey", "role": "bass"},
        {"name": "Cole", "role": "drums"},
    ],
    "genres": ["Hardcore punk"],
    "origin": "Santa Cruz",
    "num_released_lp": 1,
    "labels": ["Flatspot"], 
    "foundation_year": 2019,
}

nodoubt = {
    "name": "No Doubt",
    "members": [
        {"name": "Gnew", "role": "vocals"},
        {"name": "Tony", "role": "bass"},
        {"name": "Tom", "role": "guitars, keyboards"},
        {"name": "Adrian", "role": "drums, percussions"},
        {"name": "Stephen", "role": "Trombone"},
        {"name": "Gabrial", "role": "Trumpet, keyboards"},
    ],
    "genres": ["Pop rock", "Ska"],
    "origin": "Anaheim",
    "num_released_lp": 6,
    "labels": ["Interscope"], 
    "foundation_year": 1987,
}

In [125]:
""" venues"""
stadium_1_venue = {
    "name": "Metropolitain 2",
    "location": "Lyon",
    "capacity": 36000,
    "sold_out": "yes",    
}
stadium_2_venue = {
    "name": "Angel Park",
    "location": "Glasgow",
    "capacity": 31000,
    "sold_out": "yes",    
}
stadium_3_venue = {
    "name": "Invincibles",
    "location": "Toronto",
    "capacity": 40000,
    "sold_out": "yes",    
}
music_arena_1_venue = {
    "name": "TheMusic Bay",
    "location": "San Francisco",
    "capacity": 22000,
    "sold_out": "yes",    
}
music_arena_2_venue = {
    "name": "gute Musik",
    "location": "Berlin",
    "capacity": 13000,
    "sold_out": "yes",    
}
music_arena_3_venue = {
    "name": "Festival only_the_best",
    "location": "Rotterdam",
    "capacity": 9800,
    "sold_out": "yes",    
}
music_arena_4_venue = {
    "name": "Champions stars",
    "location": "Liverpool",
    "capacity": 5044,
    "sold_out": "yes",    
}

In [126]:
""" Albums """
album_afi_4 = {
    "title": "Black Sails In The Sunset",
    "artist": "Afi",
    "label": "Nitro",
    "year": 2000,
}
album_rhcp_9 = {
    "title": "Stadium Arcadium",
    "artist": "Red Hot Chili Peppers",
    "label": "Warner",
    "year": 2006,
}
album_billy_2 = {
    "title": "Billy Talent 2",
    "artist": "Billy Talent",
    "label": "Atlantic",
    "year": 2005,
}
album_cigar_1 = {
    "title": "Speed is relative",
    "artist": "Cigar",
    "label": "Warner",
    "year": 1999,
}
album_nodoubt_3 = {
    "title": "Tragic Kingdom",
    "artist": "No doubt",
    "label": "Interscope",
    "year": 1995,
}
album_teenidols_3 = {
    "title": "Full Leather Jacket",
    "artist": "Teen idols",
    "label": "Warner",
    "year": 2000,
}

In [127]:
""" sponsors """
sponsor_docs = [
    {
        "name": "Acme Guitars",
        "industry": "Musical Instruments",
        "type": "for bassist only",
        "combined offer": "strings",
    },
    {
        "name": "Batti Pedals",
        "industry": "Musical Instruments",
        "contract": "Ongoing",
    },
    {
        "name": "Green park",
        "industry": "Clothing",
    },
    {
        "name": "San Pat",
        "category": "Beer",
        "covered_years": 3,
    },
]

In [128]:
""" contracts"""
contract_1 = {
    "code": "con_1",
    "band_id": None,            
    "venue_id": None,  
    "start_date": datetime(2023, 12, 1),
    "end_date": datetime(2023, 12, 31),
    "payment": 2200013,
}
contract_2 = {
    "code": "con_2",
    "band_id": None,            
    "venue_id": None,  
    "start_date": datetime(2020, 3, 1),
    "end_date": datetime(2020, 3, 4),
    "payment": 6790,
}
contract_3 = {
    "code": "con_3",
    "band_id": 321,            
    "venue_id": 44,  
    "start_date": datetime(199, 8, 12),
    "end_date": datetime(1999, 8, 13),
    "payment": 31000,
}
contract_4 = {
    "code": "con_4",
    "band_id": 103,            
    "venue_id": "1AA87K43",  
    "start_date": datetime(2021, 3, 1),
    "end_date": datetime(2021, 5, 2),
    "payment": 52000,
}

In [129]:
# Note that the database isn't actually created until data are insert into it! (check also in mongosh typing: 'show dbs')
print(client.list_database_names())

['admin', 'config', 'local']


<h3 style="color:#2C5C0E  "> Data insertion </h3>

In [130]:
## Insert bands into respective collection
# "inserted_id" returns the inserted IDs that are unique identifiers generated by MongoDB
band_id1 = bands_collection.insert_one(girlschool).inserted_id
bands_collection.insert_one(nodoubt).inserted_id

## Insert many bands
band_documents = [lunachicks, l7, novatwins]
inserted_ids = []
for doc in band_documents:
    result = bands_collection.insert_one(doc)
    inserted_ids.append(result.inserted_id)

# Show the IDs of the last inserted docs
print(f"Inserted {len(inserted_ids)} band documents with IDs: {inserted_ids}")

Inserted 3 band documents with IDs: [ObjectId('653c1744a62e95c90d145ffe'), ObjectId('653c1744a62e95c90d145fff'), ObjectId('653c1744a62e95c90d146000')]


In [131]:
print(client.list_database_names())

['admin', 'config', 'local', 'music_archive_management']


In [132]:
bands_collection

Collection(Database(MongoClient(host=['127.0.0.1:27017'], document_class=dict, tz_aware=False, connect=True), 'music_archive_management'), 'bands')

In [133]:
b_count = bands_collection.count_documents({})
b_count

5

For each document in a collection, an "_id" field for the identifier is automatically generated by MongoDB, unless another _id value is specified.

In [134]:
band_with_custom_id = {
    "name": "LNRipley",
    "genre": "Drum n bass", 
    "_id": "AAA001",
    "origin": "Torino",
}
result = bands_collection.insert_one(band_with_custom_id)

<h3 style="color:#2C5C0E  "> Recap: </h3>
<div style="margin-top: -10px;">

A collection can be filled wih or without assignment. <br>
Without assigning the inserted IDs to a variable, IDs will be still generated and returned by the 'inserted_id' attr, <br>
but they are not stored, and hence not accessible later. <br>
When there is no need to access to the inserted IDs in the code, it is possible to simply insert documents without calling 'insert_id' / 'insert_ids'.  <br> 
Pay attention that if a specific ID is not stored at the moment of the insertion the ID is lost, since not automatically stored! <br>
</div>

In [135]:
album_id_1a = albums_collection.insert_one(album_cigar_1).inserted_id

In [136]:
""" Note that launching again:
"album_id_1a = albums_collection.insert_one(album_cigar_1).insterted_id" will not work, 
since MongoDB is aware that there is another doc with same ID. 
The DuplicateKeyError will be triggered: "E11000 duplicate key error collection: music_archive_management.albums index: _id_ dup key ..."
""";

In [137]:
albums_collection.insert_one(album_afi_4).inserted_id
albums_collection.insert_one(album_rhcp_9)
albums_collection.insert_one(album_billy_2)
album_id_5a = albums_collection.insert_one(album_nodoubt_3).inserted_id
album_id_6a = albums_collection.insert_one(album_teenidols_3).inserted_id

In [138]:
album_id_1a, album_id_5a

(ObjectId('653c174da62e95c90d146001'), ObjectId('653c174fa62e95c90d146005'))

In [139]:
""" Insert venues"""
venue_id_s_1 = venues_collection.insert_one(stadium_1_venue).inserted_id

In [140]:
""" Insert venues all together """
new_venues_club = [
    {
    "name": "BFFriends",
    "location": "Glasgow",
    "capacity": 670,
    "sold_out": "yes",  
    },
    {
    "name": "The ring pub",
    "location": "Manchester",
    "capacity": 230,
    "sold_out": "yes",
    },
    {
    "name": "Musicorium",
    "location": "Rome",
    "capacity": 340,
    "sold_out": "yes",
    },
    {
    "name": "GBNA",
    "location": "New York",
    "capacity": 100,
    "sold_out": "yes", 
    },
]
# Retrieve the existing venues from the collection
existing_venues = list(venues_collection.find())

In [141]:
existing_venues

[{'_id': ObjectId('653c1750a62e95c90d146007'),
  'name': 'Metropolitain 2',
  'location': 'Lyon',
  'capacity': 36000,
  'sold_out': 'yes'}]

In [142]:
# Combine the existing and new venue documents into a single list
all_venues = existing_venues + new_venues_club
all_venues

[{'_id': ObjectId('653c1750a62e95c90d146007'),
  'name': 'Metropolitain 2',
  'location': 'Lyon',
  'capacity': 36000,
  'sold_out': 'yes'},
 {'name': 'BFFriends',
  'location': 'Glasgow',
  'capacity': 670,
  'sold_out': 'yes'},
 {'name': 'The ring pub',
  'location': 'Manchester',
  'capacity': 230,
  'sold_out': 'yes'},
 {'name': 'Musicorium',
  'location': 'Rome',
  'capacity': 340,
  'sold_out': 'yes'},
 {'name': 'GBNA', 'location': 'New York', 'capacity': 100, 'sold_out': 'yes'}]

In [143]:
venues = all_venues[1:]
venues

[{'name': 'BFFriends',
  'location': 'Glasgow',
  'capacity': 670,
  'sold_out': 'yes'},
 {'name': 'The ring pub',
  'location': 'Manchester',
  'capacity': 230,
  'sold_out': 'yes'},
 {'name': 'Musicorium',
  'location': 'Rome',
  'capacity': 340,
  'sold_out': 'yes'},
 {'name': 'GBNA', 'location': 'New York', 'capacity': 100, 'sold_out': 'yes'}]

In [144]:
# Insert all venue documents into the venues_collection
venues_alls_ids = venues_collection.insert_many(venues)
venues_alls_ids

<pymongo.results.InsertManyResult at 0x7f2a362d6fe0>

In [145]:
## Update a contract
contract_1["band_id"] = band_id1
contract_1["venue_id"] = venue_id_s_1

In [146]:
############ Insert contract and store their ID
all_contracts = []
contract_id_1 = contracts_collection.insert_one(contract_1).inserted_id
contract_id_2 = contracts_collection.insert_one(contract_2).inserted_id
contract_id_3 = contracts_collection.insert_one(contract_3).inserted_id
contract_id_4 = contracts_collection.insert_one(contract_4).inserted_id
all_contracts.append(contract_id_1)
all_contracts.append(contract_id_2)
all_contracts.append(contract_id_3)
all_contracts.append(contract_id_4)

print(contract_id_4)
print()
print(all_contracts)

653c1756a62e95c90d14600f

[ObjectId('653c1756a62e95c90d14600c'), ObjectId('653c1756a62e95c90d14600d'), ObjectId('653c1756a62e95c90d14600e'), ObjectId('653c1756a62e95c90d14600f')]


In [147]:
# Insert many at the time
sponsors_collection.insert_many(sponsor_docs)

<pymongo.results.InsertManyResult at 0x7f2a58cd97e0>

In [148]:
### Update a document and the relative collection
update_query = {"name": "Musicorium"}
the_new_values = {"$set": {"capacity": 500}}
venues_collection.update_one(update_query, the_new_values)

<pymongo.results.UpdateResult at 0x7f2a35f7d7b0>

In [149]:
####### Add new attributes
add_query = {"code": "con_4"}
new_values = {
    "$set": {
        "people_involved": 10,
        "typology": "temporary" 
    }
}
contracts_collection.update_one(update_query, new_values);  # avoid printing the output!

<h3 style="color:#2C5C0E  "> Query </h3>

In [150]:
cur = bands_collection.find()
for doc in cur:
    print(doc)

{'_id': ObjectId('653c1744a62e95c90d145ffc'), 'name': 'Girlschool', 'members': [{'name': 'Denise', 'role': 'drums'}, {'name': 'Kim', 'role': 'lead vocals, guitars'}, {'name': 'Jackie', 'role': 'guitars'}, {'name': 'Tracey', 'role': 'bass'}], 'genres': ['Heavy metal', 'Hard rock', 'NWOBHM'], 'origin': 'London', 'num_released_lp': 14, 'studio_albums': [{'year': 1980, 'title': 'Demolition'}, {'year': 1981, 'title': 'Hit and Run'}, {'year': 1982, 'title': 'Screaming Blue Murder'}, {'year': 1983, 'title': 'Play Dirty'}, {'year': 1985, 'title': 'Running Wild'}, {'year': 1986, 'title': 'Nightmare at Maple Cross'}, {'year': 1988, 'title': 'Take a Bite'}, {'year': 1993, 'title': 'Girlschool'}, {'year': 2002, 'title': '21st Anniversary: Not That Innocent'}, {'year': 2004, 'title': 'Believe'}, {'year': 2008, 'title': 'Legacy'}, {'year': 2011, 'title': 'Hit and Run - Revisited'}, {'year': 2015, 'title': 'Guilty as Sin'}, {'year': 2023, 'title': 'WTFortyfive?'}], 'labels': ['City', 'Bronze', 'Mercu

In [151]:
cur = contracts_collection.find()
for doc in cur:
    print(doc)

{'_id': ObjectId('653c1756a62e95c90d14600c'), 'code': 'con_1', 'band_id': ObjectId('653c1744a62e95c90d145ffc'), 'venue_id': ObjectId('653c1750a62e95c90d146007'), 'start_date': datetime.datetime(2023, 12, 1, 0, 0), 'end_date': datetime.datetime(2023, 12, 31, 0, 0), 'payment': 2200013}
{'_id': ObjectId('653c1756a62e95c90d14600d'), 'code': 'con_2', 'band_id': None, 'venue_id': None, 'start_date': datetime.datetime(2020, 3, 1, 0, 0), 'end_date': datetime.datetime(2020, 3, 4, 0, 0), 'payment': 6790}
{'_id': ObjectId('653c1756a62e95c90d14600e'), 'code': 'con_3', 'band_id': 321, 'venue_id': 44, 'start_date': datetime.datetime(199, 8, 12, 0, 0), 'end_date': datetime.datetime(1999, 8, 13, 0, 0), 'payment': 31000}
{'_id': ObjectId('653c1756a62e95c90d14600f'), 'code': 'con_4', 'band_id': 103, 'venue_id': '1AA87K43', 'start_date': datetime.datetime(2021, 3, 1, 0, 0), 'end_date': datetime.datetime(2021, 5, 2, 0, 0), 'payment': 52000}


In [152]:
cur = venues_collection.find()
for doc in cur:
    print(doc)

{'_id': ObjectId('653c1750a62e95c90d146007'), 'name': 'Metropolitain 2', 'location': 'Lyon', 'capacity': 36000, 'sold_out': 'yes'}
{'_id': ObjectId('653c1755a62e95c90d146008'), 'name': 'BFFriends', 'location': 'Glasgow', 'capacity': 670, 'sold_out': 'yes'}
{'_id': ObjectId('653c1755a62e95c90d146009'), 'name': 'The ring pub', 'location': 'Manchester', 'capacity': 230, 'sold_out': 'yes'}
{'_id': ObjectId('653c1755a62e95c90d14600a'), 'name': 'Musicorium', 'location': 'Rome', 'capacity': 500, 'sold_out': 'yes'}
{'_id': ObjectId('653c1755a62e95c90d14600b'), 'name': 'GBNA', 'location': 'New York', 'capacity': 100, 'sold_out': 'yes'}


In [153]:
cur = albums_collection.find()
for doc in cur:
    print(doc)

{'_id': ObjectId('653c174da62e95c90d146001'), 'title': 'Speed is relative', 'artist': 'Cigar', 'label': 'Warner', 'year': 1999}
{'_id': ObjectId('653c174fa62e95c90d146002'), 'title': 'Black Sails In The Sunset', 'artist': 'Afi', 'label': 'Nitro', 'year': 2000}
{'_id': ObjectId('653c174fa62e95c90d146003'), 'title': 'Stadium Arcadium', 'artist': 'Red Hot Chili Peppers', 'label': 'Warner', 'year': 2006}
{'_id': ObjectId('653c174fa62e95c90d146004'), 'title': 'Billy Talent 2', 'artist': 'Billy Talent', 'label': 'Atlantic', 'year': 2005}
{'_id': ObjectId('653c174fa62e95c90d146005'), 'title': 'Tragic Kingdom', 'artist': 'No doubt', 'label': 'Interscope', 'year': 1995}
{'_id': ObjectId('653c174fa62e95c90d146006'), 'title': 'Full Leather Jacket', 'artist': 'Teen idols', 'label': 'Warner', 'year': 2000}


In [154]:
cur = sponsors_collection.find()
for doc in cur:
    print(doc)

{'_id': ObjectId('653c1757a62e95c90d146010'), 'name': 'Acme Guitars', 'industry': 'Musical Instruments', 'type': 'for bassist only', 'combined offer': 'strings'}
{'_id': ObjectId('653c1757a62e95c90d146011'), 'name': 'Batti Pedals', 'industry': 'Musical Instruments', 'contract': 'Ongoing'}
{'_id': ObjectId('653c1757a62e95c90d146012'), 'name': 'Green park', 'industry': 'Clothing'}
{'_id': ObjectId('653c1757a62e95c90d146013'), 'name': 'San Pat', 'category': 'Beer', 'covered_years': 3}


In [155]:
""" Show all collections """
print("Bands:")
for band in bands_collection.find():
    print(band)
print("\nAlbums:")
for alb in albums_collection.find():
    print(alb)    
print("\nVenues:")
for ven in venues_collection.find():
    print(ven)
print("\nContracts:")
for cont in contracts_collection.find():
    print(cont)
print("\nSponsors:")
for spon in sponsors_collection.find():
    print(spon)

Bands:
{'_id': ObjectId('653c1744a62e95c90d145ffc'), 'name': 'Girlschool', 'members': [{'name': 'Denise', 'role': 'drums'}, {'name': 'Kim', 'role': 'lead vocals, guitars'}, {'name': 'Jackie', 'role': 'guitars'}, {'name': 'Tracey', 'role': 'bass'}], 'genres': ['Heavy metal', 'Hard rock', 'NWOBHM'], 'origin': 'London', 'num_released_lp': 14, 'studio_albums': [{'year': 1980, 'title': 'Demolition'}, {'year': 1981, 'title': 'Hit and Run'}, {'year': 1982, 'title': 'Screaming Blue Murder'}, {'year': 1983, 'title': 'Play Dirty'}, {'year': 1985, 'title': 'Running Wild'}, {'year': 1986, 'title': 'Nightmare at Maple Cross'}, {'year': 1988, 'title': 'Take a Bite'}, {'year': 1993, 'title': 'Girlschool'}, {'year': 2002, 'title': '21st Anniversary: Not That Innocent'}, {'year': 2004, 'title': 'Believe'}, {'year': 2008, 'title': 'Legacy'}, {'year': 2011, 'title': 'Hit and Run - Revisited'}, {'year': 2015, 'title': 'Guilty as Sin'}, {'year': 2023, 'title': 'WTFortyfive?'}], 'labels': ['City', 'Bronze',

<h3 style="color:#2C5C0E  "> => Searching </h3>

In [156]:
# Find a document by a field value
print("Find a band by name:")
found_band = bands_collection.find_one({"name": "L7"})
print(found_band)

Find a band by name:
{'_id': ObjectId('653c1744a62e95c90d145fff'), 'name': 'L7', 'members': [{'name': 'Donita', 'role': 'lead vocals, guitars'}, {'name': 'Suzi', 'role': 'guitars, vocals'}, {'name': 'Jennifer', 'role': 'guitars, vocals'}, {'name': 'Demetra', 'role': 'bass, vocals'}], 'genres': ['Grunge', 'Punk Rock'], 'origin': 'Los Angeles', 'num_released_lp': 12, 'studio_albums': [{'year': 1988, 'title': 'L7'}, {'year': 1988, 'title': 'Smell the magic'}, {'year': 1991, 'title': 'Bricks are Heavy'}, {'year': 1994, 'title': 'Hungry for Stink'}, {'year': 1996, 'title': 'The Beauty Process: Triple Platinum'}, {'year': 1999, 'title': 'Slap-Happy'}, {'year': 2016, 'title': 'Wireless'}, {'year': 2018, 'title': 'Detroit'}, {'year': 2019, 'title': 'Scatter the Rats'}], 'labels': ['Epitaph', 'Sub Pop', 'Reprise'], 'foundation_year': 1985}


In [157]:
db.venues_collection.find({"capacity": 500})

<pymongo.cursor.Cursor at 0x7f2a358d59c0>

In [158]:
#To retrieve specific fields from the result of a MongoDB query, using .find() it is not enough
db.venues_collection.find({ "capacity": 500 }, { "_id": 0, "name": 1 })

<pymongo.cursor.Cursor at 0x7f2a358d4250>

In [159]:
cursor_0 = db.albums.find({ "year": 2000 })
for doc in cursor_0:
    print(doc["label"])

Nitro
Warner


In [160]:
""" Get a partial match for title.
N.B.
the field " _id": 0 is used to exclude the _id field from the query result. 
"""
cursor_1 = db.albums.find({ "year": 2000 }, { "_id": 0, "title": 'Black' }) 
for document in cursor_1:
    print(document["title"])

cursor_1

Black
Black


<pymongo.cursor.Cursor at 0x7f2a358d63e0>

In [161]:
cursor_2 = db.albums.find({ "year": 2000, "title": 'Black Sails In The Sunset' }, { "_id": 0, "title": 1 })
for document in cursor_2:
    print(document["title"])

Black Sails In The Sunset


In [162]:
sponsors_collection

Collection(Database(MongoClient(host=['127.0.0.1:27017'], document_class=dict, tz_aware=False, connect=True), 'music_archive_management'), 'sponsors')

<h3 style="color:#2C5C0E  "> => Cancel </h3>

In [163]:
# Delete a document
delete_query = {"name": "Acme Guitars"}
sponsors_collection.delete_one(delete_query)

<pymongo.results.DeleteResult at 0x7f2a58cd8310>

In [165]:
cur = sponsors_collection.find()
for doc in cur:
    print(doc)

{'_id': ObjectId('653c1757a62e95c90d146011'), 'name': 'Batti Pedals', 'industry': 'Musical Instruments', 'contract': 'Ongoing'}
{'_id': ObjectId('653c1757a62e95c90d146012'), 'name': 'Green park', 'industry': 'Clothing'}
{'_id': ObjectId('653c1757a62e95c90d146013'), 'name': 'San Pat', 'category': 'Beer', 'covered_years': 3}


In [166]:
# Empty a collection (equivalent of mongosh => db.venues_collection.deleteMany({}))
venues_collection.delete_many({})

<pymongo.results.DeleteResult at 0x7f2a58cda5f0>

In [167]:
# Check that the venues_collection does not exist anymore
cursor = venues_collection.find()
for doc in cursor:
    print(doc)

In [168]:
# Delete the entire collection (equivalent of mongosh => db.venues_collection.drop())
venues_collection.drop()

In [169]:
venues_collection

Collection(Database(MongoClient(host=['127.0.0.1:27017'], document_class=dict, tz_aware=False, connect=True), 'music_archive_management'), 'venues')

In [170]:
# Close the MongoDB connection
client.close()

In [2]:
# Reopen 
client = pymongo.MongoClient("mongodb://localhost:27017/")
db = client["music_archive_management"]

In [5]:
the_who_document = {
    "name": "The Who",
    "formed": datetime(1964, 1, 1),
    "origin": "London, England",
    "genres": ["Rock", "Hard Rock", "Power Pop"],
    "members": [
        {
            "name": "Roger Daltrey",
            "role": "lead vocals",
            "details": {
                "born": datetime(1944, 3, 1),
                "instruments": ["Vocals", "Harmonica", "Tambourine"]
            }
        },
        {
            "name": "Pete Townshend",
            "role": "guitar",
            "details": {
                "born": datetime(1945, 5, 19),
                "instruments": ["Guitar", "Vocals", "Keyboards"]
            }
        },
        {
            "name": "John Entwistle",
            "role": "bass",
            "details": {
                "born": datetime(1944, 10, 9),
                "instruments": ["Bass Guitar", "Brass", "Vocals"]
            }
        },
        {
            "name": "Keith Moon",
            "role": "drums",
            "details": {
                "born": datetime(1946, 8, 23),
                "instruments": ["Drums", "Percussion"]
            }
        }
    ],
    "discography": [
        {"album": "My Generation", "released": datetime(1965, 1, 1)},
        {"album": "A Quick One", "released": datetime(1966, 1, 1)},
        {"album": "The Who Sell Out", "released": datetime(1967, 1, 1)},
        {"album": "Tommy", "released": datetime(1969, 1, 1)},
        {"album": "Who's Next", "released": datetime(1971, 1, 1)},
        {"album": "Quadrophenia", "released": datetime(1973, 1, 1)},
        {"album": "The Who by Numbers", "released": datetime(1975, 1, 1)},
        {"album": "Who Are You", "released": datetime(1978, 1, 1)},
        {"album": "Face Dances", "released": datetime(1981, 1, 1)},
        {"album": "It's Hard", "released": datetime(1982, 1, 1)}
    ]
}

# Insert finally the doc into the right collection
bands_collection.insert_one(the_who_document)

<pymongo.results.InsertOneResult at 0x7f3840b60d00>

In [6]:
""" Retrieve the ObjectId of a document """

band_name = "The Who"  
query_who = {"name": band_name}
band_document = bands_collection.find_one(query_who)

##### Extract the ObjectId
if band_document:
    band_id = band_document["_id"]
    print(f"The ObjectId of '{band_name}' is: {band_id}")
else:
    print(f"No band found with the name '{band_name}'")

The ObjectId of 'The Who' is: 65ba12d0bb6e81da898d51cf


In [8]:
""" How to simply reference other documents ? Here a sort of relation is introduced! 

#new_id = ObjectId()
Create a new ObjectId to use an existing mongoDB ObjectId string 
existing_id = ObjectId("65ba12d0bb6e81da898d51cf")
OR...
"""
existing_id = ObjectId(band_id)

# Create and insert together directly 
venues_collection.insert_one(
{
    "timestamp": datetime.now(),
    "band": existing_id,
    "event": "Live Concert",
    "location": "New York"
}
)

<pymongo.results.InsertOneResult at 0x7f38409f0d60>

<h3 style="color:#2C5C0E  "> Notes: </h3>
<div style="margin-top: -8px;">
MongoDB itself does not enforce referential integrity or automatic linking like relational databases do with foreign keys. <br>
When looking at a document with an ObjectId reference In MongoDB Compass, it will appear as just an ID string. <br>
</div> 


In [10]:
live_stadium_tours_2000 = {
    "name": "No Doubt",
    "date": datetime(1999, 3, 6),
    "location": "Milano",
    "stadium": "San Siro",
    "Organizator": "Flip Music Factory"
}
venues_collection.insert_one(live_stadium_tours_2000)

<pymongo.results.InsertOneResult at 0x7f38409f14b0>

In [11]:
""" Aggregation Pipeline: A way to create a "fake" reference in MongoDB is using the aggregation framework.
The $lookup stage, which allows you to perform a join-like operation.
"""
pipeline = [
    {
        "$lookup": {
            "from": "bands_collection",         # The other collection to join
            "localField": "name",               # Field that contain the band name in the venues
            "foreignField": "name",             # Field from the bands collection to match
            "as": "band_info_array_field"       # The output array field
        }
    }
]

results = db.venues_collection.aggregate(pipeline)

for doc in results:
    print(doc)

In [12]:
""" Retrieve the last document created in a collection
N.B.
Sorting by "_id" in descending order effectively sorts documents by their creation time, 
since MongoDB's default ObjectId generation strategy incorporates a timestamp
"""
last_document = venues_collection.find().sort("_id", -1).limit(1).next()
# Extract the ObjectId
last_venue_doc = last_document.get("_id")
print("The _id of the last document created:", last_venue_doc)

The _id of the last document created: 65ba1986bb6e81da898d51d1


In [13]:
""" Update a document """
venues_collection.update_one(
    {"_id": last_venue_doc},
    {
        # Rename field
        "$rename": {"name": "bandName"},        
        # Add a new field
        "$set": {"status": "Sold out"}       
    }
)
print("The document was successfully updated.")

The document was successfully updated.


In [17]:
rigs_collection = db["rigs"]
with open('rigs.json', 'r') as file:
    rigs_data = json.load(file)

rigs_collection.insert_many(rigs_data)
print("Your rigs were inserted")

Your rigs were inserted
