---
layout: post 
title: Lists and Filtering Algorithms 
description: Lists and Filtering Algorithms
courses: {csp: {week: 1} }
comments: true 
sticky_rank: 1 
---

## Popcorn Hack #1

```python
# Create a list of favorite movies
movies = ["The Boy and the Heron", "Grave of Fireflies", "Howl's Moving Castle", "Spirited Away"]

# Replace the second movie
movies[1] = "Whiplash"

# Add another movie to the list
movies.append("Finding Nemo")

# Display the updated list
print(movies)
```

```
['The Boy and the Heron', 'Whiplash', "Howl's Moving Castle", 'Spirited Away', 'Finding Nemo']
```


## Popcorn Hack #2

```python
ages = [15, 20, 34, 16, 18, 21, 14, 19]
ages.sort()
eligible_ages = [age for age in ages if age >= 18]
print(eligible_ages)
```

```
[18, 19, 20, 21, 34]
```


## Homework #1

Video (8 minutes): Python list comprehensions are a shorthand for building a new list in a single line of code instead of writing a full for loop. They keep code succinct and readable, especially when transforming or filtering the elements of a collection. The general syntax is [new_item for item in iterable]; an if at the end filters elements out, while an if-else inside the expression transforms values depending on a condition. List comprehensions also support nested loops for generating combinations, and similar structures exist for sets and dictionaries. Set comprehensions use {} and remove duplicates automatically, while dictionary comprehensions use {key: value for item in iterable} and also accept conditions. If a comprehension becomes too long or convoluted, a regular loop is the more readable choice.
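
The forms described in this summary can be sketched in a few lines. The sample data (`numbers`, `words`) is made up for illustration and does not come from the video.

```python
# Sample data invented for illustration
numbers = [1, 2, 3, 4, 5, 6]
words = ["apple", "banana", "apple", "cherry"]

# Basic form: [new_item for item in iterable]
squares = [n * n for n in numbers]

# A trailing `if` filters elements out
evens = [n for n in numbers if n % 2 == 0]

# An `if-else` inside the expression transforms every element
labels = ["even" if n % 2 == 0 else "odd" for n in numbers]

# Nested loops generate combinations
pairs = [(x, y) for x in [1, 2] for y in ["a", "b"]]

# Set comprehension: curly braces, duplicates dropped automatically
unique_lengths = {len(w) for w in words}

# Dictionary comprehension with a condition
long_word_lengths = {w: len(w) for w in words if len(w) > 5}

print(squares)            # [1, 4, 9, 16, 25, 36]
print(evens)              # [2, 4, 6]
print(labels)             # ['odd', 'even', 'odd', 'even', 'odd', 'even']
print(pairs)              # [(1, 'a'), (1, 'b'), (2, 'a'), (2, 'b')]
print(long_word_lengths)  # {'banana': 6, 'cherry': 6}
```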

Video (5 minutes): Lists are a built-in Python data structure for storing data in a defined order. They are created with the list() constructor or, more commonly, with square brackets [], and they can be initialized with values or extended later with methods like .append(). Unlike sets, lists preserve the order of their elements and accept duplicates. Items are indexed starting at 0, and Python also supports negative indexing, where -1 is the last item, -2 is the second to last, and so on. You can access a single element by index or retrieve a range of elements by slicing, which includes the start index and excludes the end index. A single list can hold different types of data at once: integers, strings, floats, booleans, and even other lists. This sets Python apart from languages that restrict a list or array to one element type. Lists can be concatenated with the + operator, and the order of the operands determines the order of the elements in the result. Concatenation creates a new list and leaves the original lists unchanged. Python provides many built-in list methods, such as reversing, sorting, and clearing, which can be listed with dir() and explored with help(). Overall, lists are versatile tools for organizing and manipulating data in Python.
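
A short sketch of the indexing, slicing, and concatenation behavior summarized here. The list contents are invented for illustration.

```python
# A mixed-type list; the values are made up for illustration
items = [42, "hello", 3.14, True, [1, 2]]

first = items[0]      # indexing starts at 0
last = items[-1]      # -1 refers to the last item
middle = items[1:3]   # slice: start index included, end index excluded

# Concatenation builds a new list; operand order sets element order
a = [1, 2]
b = [3, 4]
ab = a + b
ba = b + a

print(first, last, middle)  # 42 [1, 2] ['hello', 3.14]
print(ab, ba)               # [1, 2, 3, 4] [3, 4, 1, 2]
print(a, b)                 # [1, 2] [3, 4]  (originals unchanged)
```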


## Homework #2

```python
# Homework Hack #2

original_list = list(range(1, 31))

filtered_list = [num for num in original_list if num % 3 == 0 and num % 5 != 0]

print("Original List:")
print(original_list)

print("\nFiltered List (Divisible by 3 but not by 5):")
print(filtered_list)
```

```
Original List:
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30]

Filtered List (Divisible by 3 but not by 5):
[3, 6, 9, 12, 18, 21, 24, 27]
```


```python
# Homework Hack #3 - filter_spotify_data function

import pandas as pd
from IPython.display import display  # built in to notebooks; imported so the code also runs as a script

def filter_spotify_data(file_path):
    """
    Reads the Spotify Global Streaming Data 2024 CSV,
    filters for songs with over 10 million total streams,
    and displays the result clearly.
    """
    try:
        # Load the data
        df = pd.read_csv(file_path)

        # Convert to numeric in case of formatting issues
        df['Total Streams (Millions)'] = pd.to_numeric(df['Total Streams (Millions)'], errors='coerce')

        # Filter for songs with more than 10 million streams
        filtered_df = df[df['Total Streams (Millions)'] > 10]

        # Display results
        if not filtered_df.empty:
            print("🎧 Songs with Over 10 Million Streams:\n")
            display(filtered_df[['Artist', 'Album', 'Total Streams (Millions)']])
        else:
            print("No songs found with more than 10 million streams.")

        return filtered_df

    except FileNotFoundError:
        print(f"❌ File not found: '{file_path}'")
    except Exception as e:
        print(f"❌ An error occurred: {e}")
```

```python
filter_spotify_data("Spotify_2024_Global_Streaming_Data.csv")
```

```
🎧 Songs with Over 10 Million Streams:
```



| | Artist | Album | Total Streams (Millions) |
|---|---|---|---|
| 0 | Taylor Swift | 1989 (Taylor's Version) | 3695.53 |
| 1 | The Weeknd | After Hours | 2828.16 |
| 2 | Post Malone | Austin | 1425.46 |
| 3 | Ed Sheeran | Autumn Variations | 2704.33 |
| 4 | Ed Sheeran | Autumn Variations | 3323.25 |
| ... | ... | ... | ... |
| 495 | Karol G | MAÑANA SERÁ BONITO | 2947.97 |
| 496 | Dua Lipa | Future Nostalgia | 4418.61 |
| 497 | Karol G | MAÑANA SERÁ BONITO | 2642.90 |
| 498 | SZA | SOS | 4320.23 |


| | Country | Artist | Album | Genre | Release Year | Monthly Listeners (Millions) | Total Streams (Millions) | Total Hours Streamed (Millions) | Avg Stream Duration (Min) | Platform Type | Streams Last 30 Days (Millions) | Skip Rate (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Germany | Taylor Swift | 1989 (Taylor's Version) | K-pop | 2019 | 23.10 | 3695.53 | 14240.35 | 4.28 | Free | 118.51 | 2.24 |
| 1 | Brazil | The Weeknd | After Hours | R&B | 2022 | 60.60 | 2828.16 | 11120.44 | 3.90 | Premium | 44.87 | 23.98 |
| 2 | United States | Post Malone | Austin | Reggaeton | 2023 | 42.84 | 1425.46 | 4177.49 | 4.03 | Free | 19.46 | 4.77 |
| 3 | Italy | Ed Sheeran | Autumn Variations | K-pop | 2018 | 73.24 | 2704.33 | 12024.08 | 3.26 | Premium | 166.05 | 25.12 |
| 4 | Italy | Ed Sheeran | Autumn Variations | R&B | 2023 | 7.89 | 3323.25 | 13446.32 | 4.47 | Free | 173.43 | 15.82 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 495 | Brazil | Karol G | MAÑANA SERÁ BONITO | Jazz | 2018 | 18.80 | 2947.97 | 12642.83 | 3.59 | Premium | 83.30 | 18.58 |
| 496 | Canada | Dua Lipa | Future Nostalgia | Classical | 2023 | 89.68 | 4418.61 | 11843.46 | 3.15 | Free | 143.96 | 5.82 |
| 497 | Germany | Karol G | MAÑANA SERÁ BONITO | Rock | 2023 | 36.93 | 2642.90 | 8637.46 | 4.08 | Free | 76.36 | 15.84 |
| 498 | Canada | SZA | SOS | Indie | 2022 | 87.26 | 4320.23 | 12201.40 | 2.79 | Free | 84.50 | 13.07 |


## Review

Python lists are a fundamental data type for storing ordered collections of items. Lists are declared with square brackets ([]) and can contain elements of any data type, including numbers, strings, booleans, and even other lists. Lists are mutable: their contents can be changed after they are created. You can add elements with .append() or .insert(), remove elements with .remove() or .pop(), and read or modify elements by index. Python also supports list slicing, so you can extract a subset of elements with start:stop. You can concatenate lists with the + operator, repeat them with *, and reverse or sort them with built-in methods.
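
The mutating operations named in this paragraph can be tried on a small list. The `tasks` and `nums` values are placeholders chosen for illustration.

```python
tasks = ["write", "test"]

tasks.append("deploy")      # add to the end
tasks.insert(1, "review")   # insert at index 1
tasks.remove("test")        # remove the first matching value
last_task = tasks.pop()     # remove and return the last item
tasks[0] = "draft"          # modify an element by index

doubled = tasks * 2         # repetition builds a new list

nums = [3, 1, 2]
nums.sort()                 # sorts in place: [1, 2, 3]
nums.reverse()              # reverses in place: [3, 2, 1]

print(tasks)      # ['draft', 'review']
print(last_task)  # deploy
print(doubled)    # ['draft', 'review', 'draft', 'review']
print(nums)       # [3, 2, 1]
```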

A practical example of a filtering algorithm is the spam filter in an email client. When a new message arrives, the algorithm checks it against criteria such as keywords, sender address, and user history. If the message exceeds a set threshold, it is labeled as spam and moved automatically to a separate folder. This keeps the inbox uncluttered and shields users from phishing and unwanted ads.
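
A toy version of such a filter might look like the sketch below. Everything in it is an assumption made for illustration: the keyword set, the blocked sender, the scoring rule, and the threshold are hypothetical, and user history is not modeled. Real spam filters are considerably more sophisticated.

```python
# Hypothetical criteria, invented for this sketch
SPAM_KEYWORDS = {"winner", "free", "urgent"}
BLOCKED_SENDERS = {"promo@example.com"}
SPAM_THRESHOLD = 2  # assumed cutoff: a score at or above this counts as spam

def spam_score(message):
    """One point per spam keyword, plus two points for a blocked sender."""
    words = message["body"].lower().split()
    score = sum(1 for word in words if word.strip(".,!?") in SPAM_KEYWORDS)
    if message["sender"] in BLOCKED_SENDERS:
        score += 2
    return score

def split_inbox(messages):
    """Filter messages into (inbox, spam) lists using the threshold."""
    inbox = [m for m in messages if spam_score(m) < SPAM_THRESHOLD]
    spam = [m for m in messages if spam_score(m) >= SPAM_THRESHOLD]
    return inbox, spam

messages = [
    {"sender": "friend@example.com", "body": "Lunch tomorrow?"},
    {"sender": "promo@example.com", "body": "Limited offer inside"},
    {"sender": "unknown@example.com", "body": "URGENT! You are a winner, claim your free prize"},
]

inbox, spam = split_inbox(messages)
print(len(inbox), len(spam))  # 1 2
```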

Analyzing the efficiency of filtering algorithms is critical in software development because it directly affects performance, scalability, and user experience. Efficient algorithms keep applications responsive as data volumes grow. In large systems, such as social networks or e-commerce platforms, an inefficient algorithm can lead to slow processing, high memory usage, or even system failure. By examining algorithmic complexity and removing unnecessary work, programmers can build applications that are both functional and stable.
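
As a small illustration of why complexity matters, the sketch below filters the same data two ways. The function names and sample data are hypothetical. Checking membership in a list scans it element by element, so filtering n items against m allowed values takes roughly n × m comparisons. Converting the allowed values to a set first makes each check constant time on average, for roughly n + m steps in total.

```python
def filter_with_list(items, allowed):
    # `x in allowed` scans the list: O(len(allowed)) per check
    return [x for x in items if x in allowed]

def filter_with_set(items, allowed):
    allowed_set = set(allowed)  # built once: O(len(allowed))
    # `x in allowed_set` is a hash lookup: O(1) on average per check
    return [x for x in items if x in allowed_set]

items = list(range(10))
allowed = [2, 3, 5, 7]

# Both approaches return the same result; only the amount of work differs
print(filter_with_list(items, allowed))  # [2, 3, 5, 7]
print(filter_with_set(items, allowed))   # [2, 3, 5, 7]
```

With ten items the difference is invisible, but it grows with the size of both collections.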


