# Create My Own Sound Collection

In this notebook, we will build our own collection of sound fonts by querying the Freesound API. We will use modular code stored in the **src** folder to:
- Query Freesound for sounds matching our desired keywords.
- Download high-quality previews.
- Store relevant metadata in a CSV file.

This approach follows the structure of the original Notebook 1 from the AMPLAB module, but our implementation is modular.



## Environment Setup and Imports

This cell confirms that our configuration (API key, file paths, etc.) is loaded correctly from src/config.py and that our functions for querying Freesound are accessible.

In [1]:
# Ensure that your virtual environment is activated and dependencies are installed (see requirements.txt).

# Import configuration settings and utility modules from src
import os
import pandas as pd
import sys

# Add the absolute path of the project root directory to the import path
sys.path.append(os.path.abspath(".."))

# Our configuration file contains API keys, file paths, etc.
from src import config  
# Our sound_collection module includes functions to query Freesound and process sound records.
from src.sound_collection import query_freesound, download_sound_preview, make_metadata_record

# Display configuration to confirm settings.
# Only the first characters of the API key are shown so the secret does not leak into shared output.
print("Freesound API Key:", config.FREESOUND_API_KEY[:4] + "...")
print("Files Directory:", config.RAW_DIR)
print("Metadata CSV File:", config.DATAFRAME_FILENAME)


Freesound API Key: zr5k...
Files Directory: ../data/raw
Metadata CSV File: ../data/metadata/fonts_collection.csv
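
The contents of src/config.py are not shown in this notebook. As a rough sketch of how such a module could be organised, the snippet below builds the same setting names used above, with the paths taken from the printed output. Reading the key from an environment variable named `FREESOUND_API_KEY` and the `load_settings` helper are assumptions for illustration, not a description of our actual config file.

```python
import os

def load_settings(env=os.environ, data_root="../data"):
    """Hypothetical stand-in for src/config.py; returns the settings as a dict."""
    metadata_dir = f"{data_root}/metadata"
    return {
        # Assumption: the key is supplied through an environment variable
        # instead of being hard-coded in the repository.
        "FREESOUND_API_KEY": env.get("FREESOUND_API_KEY", ""),
        "RAW_DIR": f"{data_root}/raw",
        "PROCESSED_DIR": f"{data_root}/processed",
        "METADATA_DIR": metadata_dir,
        "DATAFRAME_FILENAME": f"{metadata_dir}/fonts_collection.csv",
    }

# A plain dict stands in for the environment so the example is reproducible.
settings = load_settings(env={"FREESOUND_API_KEY": "dummy-key"})
for name, value in settings.items():
    if name != "FREESOUND_API_KEY":  # never print the secret itself
        print(f"{name}: {value}")
```

Keeping the key out of the source tree means the notebook and its outputs can be shared without exposing credentials.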


## Prepare the Data Directories

In [2]:
# Check whether each data directory (raw, processed, metadata) exists; create any that are missing.
for folder in [config.RAW_DIR, config.PROCESSED_DIR, config.METADATA_DIR]:
    if not os.path.exists(folder):
        os.makedirs(folder)
        print(f"Created directory: {folder}")
    else:
        print(f"Directory {folder} already exists.")


Directory ../data/raw already exists.
Directory ../data/processed already exists.
Directory ../data/metadata already exists.


## Define Freesound Query Parameters

Adjust the queries to target the type of sound “fonts” you want. For example, “dog bark” might be used as a percussive element, while “vowels” can add a human quality. Each `filter` restricts results to a duration range in seconds, so only sounds of an appropriate length are retrieved, and `num_results` sets how many sounds we request per query.

In [3]:
# Define a list of queries to build our sound collection.
# Here we customize our search terms and filters based on our creative vision.
freesound_queries = [
    {
        'query': 'rain',
        'filter': 'duration:[5 TO 20]',  # Capture rain sounds between 5 and 20 seconds
        'num_results': 15,
    },
    {
        'query': 'forest ambience',
        'filter': 'duration:[10 TO 30]',  # Longer, immersive ambient recordings
        'num_results': 15,
    },
    {
        'query': 'wind',
        'filter': 'duration:[3 TO 15]',  # Wind noise can vary in length
        'num_results': 15,
    },
]


# Display the queries for confirmation.
print("Freesound Queries:")
for q in freesound_queries:
    print(q)


Freesound Queries:
{'query': 'rain', 'filter': 'duration:[5 TO 20]', 'num_results': 15}
{'query': 'forest ambience', 'filter': 'duration:[10 TO 30]', 'num_results': 15}
{'query': 'wind', 'filter': 'duration:[3 TO 15]', 'num_results': 15}
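
The filter strings above all share the same `duration:[MIN TO MAX]` shape, so they can be generated instead of typed by hand. The following is a minimal sketch; `duration_filter` and `make_query` are hypothetical helper names that do not exist in our src folder.

```python
def duration_filter(min_seconds, max_seconds):
    """Build a duration filter string in the form used by our queries."""
    if min_seconds < 0 or max_seconds < min_seconds:
        raise ValueError("expected 0 <= min_seconds <= max_seconds")
    return f"duration:[{min_seconds} TO {max_seconds}]"

def make_query(query, min_seconds, max_seconds, num_results=15):
    """Return a query dict with the same keys as the entries in freesound_queries."""
    return {
        'query': query,
        'filter': duration_filter(min_seconds, max_seconds),
        'num_results': num_results,
    }

example_queries = [
    make_query('rain', 5, 20),
    make_query('forest ambience', 10, 30),
    make_query('wind', 3, 15),
]
for q in example_queries:
    print(q)
```

Validating the range up front catches a swapped minimum and maximum before any API request is made.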


## Query Freesound and Collect Sound Objects 

This cell uses our query_freesound function to perform the API searches. The results from each query are concatenated into one list representing our overall collection.

In [4]:
# Initialize an empty list to store all retrieved sound objects.
all_sounds = []

# Loop through each query configuration and perform the Freesound search.
for query_info in freesound_queries:
    results = query_freesound(query_info['query'], query_info['filter'], query_info['num_results'])
    print(f"Retrieved {len(results)} sounds for query '{query_info['query']}'.")
    all_sounds.extend(results)

print(f"Total sounds retrieved: {len(all_sounds)}")


Retrieved 15 sounds for query 'rain'.
Retrieved 15 sounds for query 'forest ambience'.
Retrieved 15 sounds for query 'wind'.
Total sounds retrieved: 45
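
Because the results of all queries are concatenated, a sound that matches more than one query (for example both 'rain' and 'wind') could appear twice in the list. Whether that happened in this run is not visible from the output. The sketch below shows one way to drop repeats by `id` while keeping the original order; `deduplicate_by_id` is a hypothetical helper, and `SimpleNamespace` objects stand in for real sound objects.

```python
from types import SimpleNamespace

def deduplicate_by_id(sounds):
    """Return sounds with repeated ids removed, preserving first-seen order."""
    seen_ids = set()
    unique = []
    for sound in sounds:
        if sound.id not in seen_ids:
            seen_ids.add(sound.id)
            unique.append(sound)
    return unique

# Stand-in objects: only the `id` attribute matters for this example.
fake_sounds = [SimpleNamespace(id=i) for i in (86351, 695793, 86351, 486267, 695793)]
unique_sounds = deduplicate_by_id(fake_sounds)
print([s.id for s in unique_sounds])
```

In the notebook this would be applied as `all_sounds = deduplicate_by_id(all_sounds)` before downloading, so that each preview is fetched and recorded only once.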


## Download Sound Previews

In [5]:
# Loop through each sound and download its preview to our designated directory.
for idx, sound in enumerate(all_sounds):
    print(f"Downloading sound {idx+1}/{len(all_sounds)}: id {sound.id}")
    download_sound_preview(sound, config.RAW_DIR)


Downloading sound 1/45: id 86351
Downloading sound 2/45: id 695793
Downloading sound 3/45: id 486267
Downloading sound 4/45: id 334148
Downloading sound 5/45: id 277749
Downloading sound 6/45: id 50053
Downloading sound 7/45: id 337791
Downloading sound 8/45: id 34066
Downloading sound 9/45: id 787958
Downloading sound 10/45: id 251631
Downloading sound 11/45: id 387064
Downloading sound 12/45: id 464283
Downloading sound 13/45: id 513394
Downloading sound 14/45: id 435665
Downloading sound 15/45: id 316895
Downloading sound 16/45: id 662090
Downloading sound 17/45: id 464477
Downloading sound 18/45: id 619325
Downloading sound 19/45: id 560287
Downloading sound 20/45: id 642763
Downloading sound 21/45: id 632754
Downloading sound 22/45: id 698356
Downloading sound 23/45: id 632346
Downloading sound 24/45: id 634226
Downloading sound 25/45: id 410526
Downloading sound 26/45: id 348830
Downloading sound 27/45: id 509176
Downloading sound 28/45: id 528353
Downloading sound 29/45: id 4051
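
How `download_sound_preview` treats files that are already on disk is not shown here. If it does not skip them, re-running the cell repeats every download. The sketch below illustrates a pre-check, assuming the local filename is the last path segment of the preview URL (consistent with names such as `86351_14771-hq.ogg` in our data directory). The helper names and the example URLs are hypothetical.

```python
import os
import posixpath
import tempfile
from urllib.parse import urlparse

def preview_filename(preview_url):
    """Return the last path segment of a preview URL (assumed naming scheme)."""
    return posixpath.basename(urlparse(preview_url).path)

def missing_previews(preview_urls, directory):
    """Return the URLs whose target file is not yet present in `directory`."""
    return [
        url for url in preview_urls
        if not os.path.exists(os.path.join(directory, preview_filename(url)))
    ]

# Illustrative URLs only; real preview URLs come from the Freesound API.
urls = [
    "https://example.org/previews/86351_14771-hq.ogg",
    "https://example.org/previews/695793_15096086-hq.ogg",
]
with tempfile.TemporaryDirectory() as raw_dir:
    # Pretend the first preview was downloaded in an earlier run.
    open(os.path.join(raw_dir, preview_filename(urls[0])), "wb").close()
    todo = missing_previews(urls, raw_dir)

print(todo)
```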

## Create Metadata Records and Save DataFrame

The make_metadata_record function extracts key details from each sound, and we then store the entire collection in a CSV file. This DataFrame will serve as the basis for further analysis and later stages of the project.

In [6]:
# Build a list of metadata records for each sound using our utility function.
metadata_records = [make_metadata_record(sound, config.RAW_DIR) for sound in all_sounds]

# Create a Pandas DataFrame from the metadata records.
df_metadata = pd.DataFrame(metadata_records)

# Save the DataFrame to CSV for later use.
df_metadata.to_csv(config.DATAFRAME_FILENAME, index=False)
print(f"Saved metadata DataFrame with {len(df_metadata)} entries to {config.DATAFRAME_FILENAME}.")


Saved metadata DataFrame with 45 entries to ../data/metadata/fonts_collection.csv.
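
For orientation, each record ends up as one row with the columns `name`, `username`, `license`, `tags`, `freesound_id` and `path`. The sketch below builds a dict of that shape from a stand-in object. It is an illustration of the record layout only: `make_record_sketch` is hypothetical, the attribute names on the stand-in are assumptions, and the real `make_metadata_record` in src may work differently.

```python
from types import SimpleNamespace

RECORD_FIELDS = ['name', 'username', 'license', 'tags', 'freesound_id', 'path']

def make_record_sketch(sound, local_path):
    """Build one metadata row; attribute names on `sound` are assumed."""
    return {
        'name': sound.name,
        'username': sound.username,
        'license': sound.license,
        'tags': list(sound.tags),
        'freesound_id': sound.id,
        'path': local_path,
    }

# Stand-in object populated with values from one row of our collection.
fake_sound = SimpleNamespace(
    id=334148,
    name='Rain-Light-Loopable.wav',
    username='svampen',
    license='http://creativecommons.org/licenses/by/3.0/',
    tags=['weather', 'loop', 'rain', 'field-recording'],
)
record = make_record_sketch(fake_sound, '../data/raw/334148_5910095-hq.ogg')
print(record)
```

A list of such dicts is exactly what `pd.DataFrame(...)` expects, with the dict keys becoming column names.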


## Display the Metadata DataFrame

Displaying the DataFrame allows you to visually inspect the metadata and confirm that all information (e.g., sound names, tags, file paths) has been captured correctly.

In [7]:
# Load the metadata DataFrame to verify its contents.
df_loaded = pd.read_csv(config.DATAFRAME_FILENAME)
display(df_loaded)


| | name | username | license | tags | freesound_id | path |
|---|---|---|---|---|---|---|
| 0 | RAIN STICK A 005.wav | sandyrb | https://creativecommons.org/licenses/by/4.0/ | ['chile', 'chilean', 'ghana', 'ghanaian', 'ins... | 86351 | ../data/raw/86351_14771-hq.ogg |
| 1 | Raining in my house.m4a | ManDaKi | http://creativecommons.org/publicdomain/zero/1.0/ | ['raindrops', 'Raining', 'rain', 'raining'] | 695793 | ../data/raw/695793_15096086-hq.ogg |
| 2 | R23-34-Raining on Ground.wav | craigsmith | http://creativecommons.org/publicdomain/zero/1.0/ | ['Vintage', 'Optical', 'Rain', 'Nature'] | 486267 | ../data/raw/486267_2524442-hq.ogg |
| 3 | Rain-Light-Loopable.wav | svampen | http://creativecommons.org/licenses/by/3.0/ | ['weather', 'loop', 'rain', 'field-recording'] | 334148 | ../data/raw/334148_5910095-hq.ogg |
| 4 | Rain_Window_int.wav | alexkandrell | http://creativecommons.org/licenses/by/3.0/ | ['Window', 'internal', 'Rain'] | 277749 | ../data/raw/277749_1363668-hq.ogg |
| 5 | rain_session_thunder1_2006.wav | vibe_crc | http://creativecommons.org/publicdomain/zero/1.0/ | ['nature', 'rain', 'storm', 'thunder'] | 50053 | ../data/raw/50053_333536-hq.ogg |
| 6 | city pouring rain.wav | miradeshazer | https://creativecommons.org/licenses/by/4.0/ | ['city', 'umbrella', 'field-recording', 'downp... | 337791 | ../data/raw/337791_5994209-hq.ogg |
| 7 | AMBIENT - Rain - Light - Near Drainpipe (LOOP)... | Arctura | http://creativecommons.org/licenses/by/3.0/ | ['deep', 'drainpipe', 'droplets', 'field-recor... | 34066 | ../data/raw/34066_28216-hq.ogg |
| 8 | WATRDran-EXT_RAIN-Stream Of Water From Metal D... | YouMightLikeThis | http://creativecommons.org/publicdomain/zero/1.0/ | ['field-recording', 'raindrops', 'rain', 'meta... | 787958 | ../data/raw/787958_16787921-hq.ogg |
| 9 | NYC Rain 4 B rumble | bmlake | https://creativecommons.org/licenses/by/4.0/ | ['NYC', 'Weather', 'Rain', 'Thunder'] | 251631 | ../data/raw/251631_4040997-hq.ogg |
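
One thing to keep in mind when inspecting the reloaded data: a column that held Python lists before saving comes back from a CSV file as plain text such as `"['rain', 'storm']"`. If the `tags` column was stored that way, it can be converted back with `ast.literal_eval`. The sketch below is one cautious way to do it; `parse_tags` is a hypothetical helper.

```python
import ast

def parse_tags(cell):
    """Convert a stringified list of tags back into a list; return [] if that fails."""
    if isinstance(cell, list):
        return cell
    try:
        parsed = ast.literal_eval(cell)
    except (ValueError, SyntaxError, TypeError):
        # Covers malformed text as well as missing values such as NaN.
        return []
    return list(parsed) if isinstance(parsed, (list, tuple)) else []

tags = parse_tags("['weather', 'loop', 'rain', 'field-recording']")
print(tags, len(tags))
```

With the DataFrame loaded above, this could be applied as `df_loaded['tags'] = df_loaded['tags'].apply(parse_tags)`.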
