# 1. Introduction: Can You Hear the Internet?
Have you ever wondered what your internet usage sounds like? We're used to seeing data in charts and graphs, but what if we could represent it with music? This was the question that sparked the Network Communications Project.

In this article, I'll walk you through how I took a raw dataset of my university's weekly network traffic and transformed it into a musical piece. The goal was to "sonify" the download and upload statistics, creating a unique auditory representation of data.

We'll use Python's Pandas library for the data manipulation and a fascinating online tool called [TwoTone MIDI Out Beta](https://twotone-midiout-beta.netlify.app/)  to generate the final musical output. Let's get started!

# 2. The Raw Material: Sourcing the Network Data

Every data story starts with the data. For this project, the source was the Janet Netsight Portal, which tracks network traffic for UK educational institutions.

**The Goal:** To get a historical view of the "In" (download) and "Out" (upload) traffic for my University.  
**The Challenge:** The portal doesn't allow for a simple "download all" button. Through experimentation, I found that requesting a date range of approximately 549 days (e.g., from July 2023 to January 2025) provided the daily data I needed.  
**A Quick Note on Perspective:** The data is labeled from the provider's (Janet's) perspective. This means "In" is data coming into their network from the university (our upload), and "Out" is data going out to the university (our download). We'll need to remember to swap these later!

# 3. The Setup: Loading and Cleaning the Data  

With several CSV files downloaded, the first step is to load them into a single, clean DataFrame using Pandas. We'll use Python's glob library to find all the CSV files in our input directory.

## Code Block 1: Imports and File Loading  
First, let's import our libraries and set up the path to our data files.

In [15]:
# Import necessary libraries
import numpy as np
import pandas as pd
import glob

# Define the path to the input folder and get a list of all CSV files
path = rf'./input/*.csv'
files = glob.glob(path)

Now, we'll loop through each file, read it into a Pandas DataFrame, and combine them. A small trick here is to use a header_flag to ensure we only include the header row from the very first file, creating a clean, unified dataset.

In [16]:
# List to hold each DataFrame
df_list = []
header_flag = False

for file in files:
    df = pd.read_csv(file)

    # Only keep the header from the first file
    if not header_flag:
        header_flag = True
    else:
        # For subsequent files, skip the header row
        df = df.iloc[1:]

    df_list.append(df)

# Concatenate all DataFrames and sort by time
final_df = pd.concat(df_list, ignore_index=True)
df = final_df.sort_values('Time', ascending=True).copy()

# As mentioned, Janet's 'out' is our download ('in') and vice-versa.
# Let's rename the columns to reflect our perspective.
df.columns = ['Time', 'in', 'out']

# 4: Feature Engineering - Building a Richer Musical Palette

Raw data is just the beginning. To transform our data into a piece with rhythm, texture, and distinct movements, we need to create more features. Think of these new data columns as potential instruments or triggers in our final musical score.

The goal is to create flags that mark specific points in time, such as the start of a week, a month, or a quarter. We can also add context, like whether a given day is a workday or a weekend.

Here are the features we'll add to our daily data:

* Week_Start: A flag for Monday to mark the start of a new week.
* Month_Start: A flag for the first day of the month.
* Year_Start: A flag for the first day of the year.
* Week_Number: The week number of the year (1-52/53).

Let's generate these using the  datetime functionality built into Pandas.

## Code Block 2: Creating Time-Based Features

First, we ensure the 'Time' column is in the correct datetime format. Then, we use the .dt accessor to pull out all the information we need. We use np.where to create our flags: if a condition is true (e.g., the day is a Monday), we assign it a value of 256; otherwise, it's 0. This high value creates a clear signal for our sonification tool.

In [17]:
# Convert the 'Time' column to a proper datetime format
df['Time'] = pd.to_datetime(df['Time'], format='mixed')

# Extract time-based features that we can use for musical cues
df['Week_Start'] = np.where(df['Time'].dt.day_name() == 'Monday', 256, 0)
df['Month_Start'] = np.where(df['Time'].dt.is_month_start, 256, 0)
df['Year_Start'] = np.where(df['Time'].dt.is_year_start, 256, 0)

# We'll also extract the week number and year for grouping
df['Week_Number'] = df['Time'].dt.isocalendar().week
df['Year'] = df['Time'].dt.year

# 5. From Daily Noise to a Weekly Melody
Daily data can be noisy. To create a smoother, more melodic output, I decided to aggregate the data by week, taking the average (mean) traffic for each week.

## Code Block 3: Grouping by Week

In [11]:
# Group the data by Year and Week Number, then calculate the weekly mean for 'in' and 'out' traffic
weekly_df = df.groupby(['Year', 'Week_Number']).agg(
    in_mean=('in', 'mean'),
    out_mean=('out', 'mean'),
    Year_Start=('Year_Start', 'max'), # Carry over our markers
    Month_Start=('Month_Start', 'max')
).reset_index()

# 6. The Core Challenge: Proportional Sonification  

Here we face the most interesting challenge. Download (in) traffic is often much larger than upload (out) traffic. If we map each to a musical scale independently, the highest upload value and the highest download value would both play the same top note. This would completely hide the fact that downloads are significantly higher.

To solve this, we need to maintain the proportional relationship between the two.

**The Solution:**

1. Find the maximum value for both in and out traffic across the entire dataset.
2. Calculate a scaling_factor (out_max / in_max).
3. Quantize both data streams into a set number of bins (e.g., 48, which maps nicely to musical notes).
4. Multiply the quantized upload data by the scaling_factor. This scales it down, ensuring its musical representation is proportionally lower than the download data, just like the real traffic.

## Code Block 4: Quantizing and Scaling

In [12]:
# 1. Calculate the scaling factor
in_max = df['in'].max()
out_max = df['out'].max()
scaling_factor = out_max / in_max

# 3. Quantize the weekly average 'in' data into 48 bins
weekly_df['in_quant'] = pd.cut(weekly_df['in_mean'], bins=48, labels=range(1, 49)).astype(int)

# 4. Quantize the 'out' data and then apply the scaling factor
out_quant_scaled = (pd.cut(weekly_df['out_mean'], bins=48, labels=range(1, 49)).astype(float) * scaling_factor)
weekly_df['out_quant'] = out_quant_scaled.round().astype(int).clip(lower=1) # a value of 0 would be silence

# 7. Adding Dynamics: Representing Up and Down Trends
To make the music more dynamic, I wanted to represent not just the volume of traffic, but also its direction. Is the traffic increasing or decreasing week-on-week?

We can create two new data streams for each of our in and out values:

* An _up column that only contains a value if the traffic increased from the previous week.
* A _down column that only contains a value if the traffic decreased.
In the TwoTone tool, we can then assign different arpeggio styles (e.g., ascending for "up", descending for "down") to these streams, creating a richer sonic texture.

## Code Block 5: Capturing Trends

In [13]:
# Create columns to capture increasing trends
weekly_df['in_up'] = weekly_df['in_quant'].where(weekly_df['in_quant'] > weekly_df['in_quant'].shift(), 0)
weekly_df['out_up'] = weekly_df['out_quant'].where(weekly_df['out_quant'] > weekly_df['out_quant'].shift(), 0)

# Create columns to capture decreasing trends
weekly_df['in_down'] = weekly_df['in_quant'].where(weekly_df['in_quant'] < weekly_df['in_quant'].shift(), 0)
weekly_df['out_down'] = weekly_df['out_quant'].where(weekly_df['out_quant'] < weekly_df['out_quant'].shift(), 0)

# Display the first few rows of our final DataFrame
# On Medium, you can paste a screenshot of the output or just show the .head()
weekly_df.head(5)


Unnamed: 0,Year,Week_Number,in_mean,out_mean,Year_Start,Month_Start,in_quant,out_quant,in_up,out_up,in_down,out_down
0,2019,1,55040090.0,544081100.0,256,256,2,14,0,0,0,0
1,2019,2,128356000.0,1548700000.0,0,0,5,49,5,49,0,0
2,2019,3,134956600.0,1562483000.0,0,0,5,52,0,52,0,0
3,2019,4,154851000.0,1763358000.0,0,0,6,57,6,57,0,0
4,2019,5,170757300.0,1828013000.0,0,256,7,60,7,60,0,0


In [18]:
weekly_df.to_csv("./output/Processed_traffic.csv")

# 8. The Final Composition: Making Music with TwoTone  

With our data prepared, the final step is to bring it to life. I used the [TwoTone MIDI Out Beta](https://twotone-midiout-beta.netlify.app/) . You simply upload the final CSV, map your data columns to musical parameters, and press play.  
Here are the settings I found worked best to create a clear and pleasant composition from the weekly data:

**Download Data (in_quant)**

* Instrument: Double Bass
* Range: 2 Octaves
* Speed: 4x
* Style: Arpeggio (4 notes, ascending)

**Upload Data (out_quant)**

* Instrument: Glockenspiel
* Range: 2 Octaves
* Speed: 2x
* Style: Ascending

The low, heavy notes of the double bass represent the high volume of download traffic, while the lighter, twinkling glockenspiel represents the lower volume of upload traffic. The time markers for month and year starts were used as cues to add speech annotations (e.g., "2024," "February") using an external audio editor for the final video.

# 9. Conclusion
This project was a fascinating journey into a new, for me, form of data representation. It demonstrates that with a bit of creativity and some data manipulation skills, we can find stories in data that go beyond traditional charts. Sonification offers a uniquely emotional and intuitive way to experience data patterns over time.

The final CSV file is ready for sonification. I encourage you to download the notebook from my GitHub [link to your GitHub repo here], experiment with the data, and create your own data-driven music with the [TwoTone MIDI Out Beta](https://twotone-midiout-beta.netlify.app/)!