🧩 Data Analytics and Visualization
# Data Structures in Python: Lists, Tuples, Dictionaries, DataFrames


## 🛒 2.1 Lists
Lists are like a well-organized shopping cart: you can add items, remove them, and check what's inside. They keep your data in an ordered, mutable collection.

Real-World Example: Imagine a shopping cart that you can modify as you shop.

Creating and Accessing Lists

In [1]:
# Creating a shopping cart
shopping_cart = ["Milk", "Bread", "Eggs"]
shopping_cart


['Milk', 'Bread', 'Eggs']

In [3]:
# Accessing the first item in the cart
print(shopping_cart[0])  # Output: Milk

Milk
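Indexing is not limited to the first item. As a minimal sketch, the example below uses a separate list called demo_cart (a hypothetical name, chosen so the running shopping_cart example is left untouched) to show negative indexes and slices:

```python
# demo_cart is a hypothetical list used only for this illustration
demo_cart = ["Milk", "Bread", "Eggs"]

final_item = demo_cart[-1]    # negative indexes count from the end
first_two = demo_cart[0:2]    # a slice includes the start index and excludes the stop index

print(final_item)  # Output: Eggs
print(first_two)   # Output: ['Milk', 'Bread']
```

Slicing returns a new list, so demo_cart itself is not changed.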


List Operations


In [9]:

# append(): Adding one item to the end of your cart.
# Adding a new item to the cart
shopping_cart.append("Butter")
print(shopping_cart)  # Output: ['Milk', 'Bread', 'Eggs', 'Butter']

['Milk', 'Bread', 'Eggs', 'Butter']


In [17]:

#extend(): When you buy multiple items at once.
# Adding multiple items to the cart
shopping_cart.extend(["Cheese", "Yogurt"])
print(shopping_cart)  # Output: ['Milk', 'Bread', 'Eggs', 'Butter', 'Cheese', 'Yogurt']


['Milk', 'Bread', 'Eggs', 'Butter', 'Cheese', 'Yogurt']
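A common slip is passing a list to append() when extend() was intended. The sketch below uses two hypothetical lists, demo_a and demo_b, to show the difference:

```python
# demo_a and demo_b are hypothetical lists used only for this comparison
demo_a = ["Milk"]
demo_a.append(["Cheese", "Yogurt"])   # the whole list becomes ONE nested item
print(demo_a)       # Output: ['Milk', ['Cheese', 'Yogurt']]
print(len(demo_a))  # Output: 2

demo_b = ["Milk"]
demo_b.extend(["Cheese", "Yogurt"])   # each element is added separately
print(demo_b)       # Output: ['Milk', 'Cheese', 'Yogurt']
print(len(demo_b))  # Output: 3
```

Because extend() changes the list in place, running the same notebook cell more than once adds the items again each time.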


In [13]:
#insert(): Placing an item at a specific spot in your cart.
# Inserting an item at the second position
shopping_cart.insert(1, "Orange Juice")
print(shopping_cart)  # Output: ['Milk', 'Orange Juice', 'Bread', 'Eggs', 'Butter', 'Cheese', 'Yogurt']


['Milk', 'Orange Juice', 'Bread', 'Eggs', 'Butter', 'Cheese', 'Yogurt']


In [19]:
#remove(): When you decide to remove an item from your cart.

# Removing an item from the cart
shopping_cart.remove("Eggs")
print(shopping_cart)  # Output: ['Milk', 'Orange Juice', 'Bread', 'Butter', 'Cheese', 'Yogurt']


['Milk', 'Orange Juice', 'Bread', 'Butter', 'Cheese', 'Yogurt']


In [21]:
#pop(): Checking out the last item in your cart.
# Removing the last item from the cart
last_item = shopping_cart.pop()
print(last_item)  # Output: Yogurt
print(shopping_cart)  # Output: ['Milk', 'Orange Juice', 'Bread', 'Butter', 'Cheese']


Yogurt
['Milk', 'Orange Juice', 'Bread', 'Butter', 'Cheese']
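Two details of remove() and pop() are worth knowing: remove() raises a ValueError when the item is not in the list, and pop() accepts an optional index. A minimal sketch on a hypothetical demo_cart list:

```python
# demo_cart is a hypothetical list used only for this illustration
demo_cart = ["Milk", "Bread", "Butter"]

# remove() raises ValueError for a missing item, so check membership first
if "Eggs" in demo_cart:
    demo_cart.remove("Eggs")
else:
    print("Eggs is not in the cart")

# pop() takes an optional index; pop(0) removes and returns the first item
first_item = demo_cart.pop(0)
print(first_item)  # Output: Milk
print(demo_cart)   # Output: ['Bread', 'Butter']
```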


In [23]:
#reverse(): Flipping your cart upside down!
# Reversing the order of items
shopping_cart.reverse()
print(shopping_cart)  # Output: ['Cheese', 'Butter', 'Bread', 'Orange Juice', 'Milk']


['Cheese', 'Butter', 'Bread', 'Orange Juice', 'Milk']
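reverse() changes the list in place and returns None. When the original order should be kept, a copy can be made instead. A minimal sketch, with demo_cart as a hypothetical list:

```python
# demo_cart is a hypothetical list used only for this illustration
demo_cart = ["Milk", "Bread", "Butter"]

reversed_copy = demo_cart[::-1]   # new list in reverse order
sorted_copy = sorted(demo_cart)   # new list in alphabetical order

print(reversed_copy)  # Output: ['Butter', 'Bread', 'Milk']
print(sorted_copy)    # Output: ['Bread', 'Butter', 'Milk']
print(demo_cart)      # Output: ['Milk', 'Bread', 'Butter']
```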


In [27]:
#len(): Counting how many items are left.

# Counting the number of items
num_items = len(shopping_cart)
print(num_items)  # Output: 5

5


In [29]:


#min(): Finding the smallest item (alphabetically).
# Finding the smallest item alphabetically
smallest_item = min(shopping_cart)
print(smallest_item)  # Output: Bread

#max(): Finding the largest item (alphabetically).
# Finding the largest item alphabetically
largest_item = max(shopping_cart)
print(largest_item)  # Output: Orange Juice

# List comprehension: Building a new list from an existing one.
# Creating a list of the lengths of item names
item_lengths = [len(item) for item in shopping_cart]
print(item_lengths)  # Output: [6, 6, 5, 12, 4]

Bread
Orange Juice
[6, 6, 5, 12, 4]
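A list comprehension can also filter items with an if clause, and max() accepts a key function to compare by something other than alphabetical order. A minimal sketch on a hypothetical demo_cart list:

```python
# demo_cart is a hypothetical list used only for this illustration
demo_cart = ["Cheese", "Butter", "Bread", "Orange Juice", "Milk"]

# Keep only the items whose names are longer than 5 characters
long_names = [item for item in demo_cart if len(item) > 5]
print(long_names)  # Output: ['Cheese', 'Butter', 'Orange Juice']

# key=len compares items by length instead of alphabetically
longest_name = max(demo_cart, key=len)
print(longest_name)  # Output: Orange Juice
```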


## 🌐 2.2 Tuples
Tuples are like a fixed set of coordinates on a map: once set, they don't change. They are immutable, so you cannot add, remove, or reassign their elements after creation.

Real-World Example: A tuple representing the coordinates of a treasure location.

Creating and Accessing Tuples

In [31]:

# Creating a tuple with treasure coordinates
treasure_location = (40.7128, -74.0060)

# Accessing the latitude
print(treasure_location[0])  # Output: 40.7128

# Tuples are immutable
# treasure_location[0] = 41.0000  # This will raise a TypeError


40.7128
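Tuples can be unpacked into separate variables, and the immutability rule can be observed safely with try/except. A minimal sketch, using a hypothetical demo_location tuple:

```python
# demo_location is a hypothetical tuple used only for this illustration
demo_location = (40.7128, -74.0060)

# Unpacking assigns each element to its own variable
latitude, longitude = demo_location
print(latitude, longitude)  # Output: 40.7128 -74.006

# Trying to change an element raises TypeError; the tuple stays unchanged
try:
    demo_location[0] = 41.0
except TypeError as error:
    print("Cannot modify a tuple:", error)
```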


## 🗂️ 2.3 Dictionaries
Dictionaries are like a well-organized address book—each entry has a unique key (like a name) and a value (like a phone number).

Real-World Example: A dictionary storing a contact’s information.

Creating and Accessing Dictionaries



In [43]:

# Creating a contact dictionary
contact_info = {
    "Name": "John Doe",
    "Phone": "123-456-7890",
    "Email": "john.doe@example.com"
}

# Accessing the contact dictionary
print(contact_info)

{'Name': 'John Doe', 'Phone': '123-456-7890', 'Email': 'john.doe@example.com'}

In [33]:

# Accessing the phone number
print(contact_info["Phone"])  # Output: 123-456-7890


123-456-7890
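Square brackets raise a KeyError when the key does not exist. As a hedged sketch, get() and the in operator offer safer lookups; demo_contact and the "Fax" key are hypothetical names used only here:

```python
# demo_contact is a hypothetical dictionary used only for this illustration
demo_contact = {"Name": "John Doe", "Phone": "123-456-7890"}

# get() returns the given default instead of raising KeyError
fax = demo_contact.get("Fax", "Not provided")
print(fax)  # Output: Not provided

# The in operator checks whether a key exists
has_email = "Email" in demo_contact
print(has_email)  # Output: False
```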


Dictionary Operations

In [35]:
# Updating and adding details
contact_info["Phone"] = "098-765-4321"  # Update the phone number
contact_info["Address"] = "456 Elm St"  # Add a new key-value pair
print(contact_info)  # Output: {'Name': 'John Doe', 'Phone': '098-765-4321', 'Email': 'john.doe@example.com', 'Address': '456 Elm St'}


{'Name': 'John Doe', 'Phone': '098-765-4321', 'Email': 'john.doe@example.com', 'Address': '456 Elm St'}


In [37]:
# Removing a detail
del contact_info["Address"]
print(contact_info)  # Output: {'Name': 'John Doe', 'Phone': '098-765-4321', 'Email': 'john.doe@example.com'}


{'Name': 'John Doe', 'Phone': '098-765-4321', 'Email': 'john.doe@example.com'}
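When the removed value is still needed, pop() is an alternative to del: it removes the key and returns its value. A minimal sketch on a hypothetical demo_contact dictionary:

```python
# demo_contact is a hypothetical dictionary used only for this illustration
demo_contact = {"Name": "John Doe", "Address": "456 Elm St"}

removed_value = demo_contact.pop("Address")   # removes the key and returns its value
missing_value = demo_contact.pop("Fax", None) # the default avoids KeyError for an absent key

print(removed_value)  # Output: 456 Elm St
print(missing_value)  # Output: None
print(demo_contact)   # Output: {'Name': 'John Doe'}
```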


In [39]:
# Iterating through the dictionary
for key, value in contact_info.items():
    print(f"{key}: {value}")

# Output:
# Name: John Doe
# Phone: 098-765-4321
# Email: john.doe@example.com

Name: John Doe
Phone: 098-765-4321
Email: john.doe@example.com
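Besides items(), dictionaries expose keys() and values(), and a dictionary comprehension can build a new dictionary in one expression. A minimal sketch; demo_contact and field_lengths are hypothetical names:

```python
# demo_contact is a hypothetical dictionary used only for this illustration
demo_contact = {"Name": "John Doe", "Phone": "098-765-4321"}

key_list = list(demo_contact.keys())
value_list = list(demo_contact.values())
print(key_list)    # Output: ['Name', 'Phone']
print(value_list)  # Output: ['John Doe', '098-765-4321']

# Dictionary comprehension: map each key to the length of its value
field_lengths = {key: len(value) for key, value in demo_contact.items()}
print(field_lengths)  # Output: {'Name': 8, 'Phone': 12}
```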


## 📊 2.4 DataFrames
DataFrames are like spreadsheets in Python, perfect for organizing and analyzing tabular data.

Real-World Example: A DataFrame representing your personal fitness tracker's log.

Creating DataFrames from Lists and Dictionaries

In [47]:
# Importing the pandas library
import pandas as pd

In [53]:

# Creating a DataFrame from a fitness log
fitness_log = pd.DataFrame({
    "Date": ["2024-01-01", "2024-01-02", "2024-01-03"],
    "Exercise": ["Running", "Cycling", "Swimming"],
    "Minutes": [30, 45, 60]
})
fitness_log

         Date  Exercise  Minutes
0  2024-01-01   Running       30
1  2024-01-02   Cycling       45
2  2024-01-03  Swimming       60
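A DataFrame can also be built from lists. The sketch below shows two common layouts, a list of dictionaries (one per row) and a list of lists with explicit column names; demo_rows, from_records, and from_lists are hypothetical names:

```python
import pandas as pd

# demo_rows is a hypothetical list of records, one dictionary per row
demo_rows = [
    {"Date": "2024-01-01", "Exercise": "Running", "Minutes": 30},
    {"Date": "2024-01-02", "Exercise": "Cycling", "Minutes": 45},
]
from_records = pd.DataFrame(demo_rows)

# A list of lists needs the column names passed explicitly
from_lists = pd.DataFrame(
    [["2024-01-01", "Running", 30], ["2024-01-02", "Cycling", 45]],
    columns=["Date", "Exercise", "Minutes"],
)

print(from_records.shape)        # Output: (2, 3)
print(list(from_lists.columns))  # Output: ['Date', 'Exercise', 'Minutes']
```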


Accessing and Modifying Data in a DataFrame

In [55]:
# Accessing the exercise column (returned as a pandas Series)
fitness_log["Exercise"]


0     Running
1     Cycling
2    Swimming
Name: Exercise, dtype: object
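Rows can be selected as well as columns. As a minimal sketch on a hypothetical demo_log DataFrame, iloc selects by position, loc selects by index label, and a boolean condition filters rows:

```python
import pandas as pd

# demo_log is a hypothetical DataFrame used only for this illustration
demo_log = pd.DataFrame({
    "Exercise": ["Running", "Cycling", "Swimming"],
    "Minutes": [30, 45, 60],
})

first_row = demo_log.iloc[0]                         # row by integer position
second_exercise = demo_log.loc[1, "Exercise"]        # single value by index label and column
long_sessions = demo_log[demo_log["Minutes"] > 40]   # rows where the condition is True

print(first_row["Exercise"])               # Output: Running
print(second_exercise)                     # Output: Cycling
print(long_sessions["Exercise"].tolist())  # Output: ['Cycling', 'Swimming']
```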

In [59]:

# Adding a new column with calories burned
fitness_log["Calories"] = [300, 400, 500]
fitness_log

# Output:
#          Date  Exercise  Minutes  Calories
# 0 2024-01-01   Running       30       300
# 1 2024-01-02   Cycling       45       400
# 2 2024-01-03  Swimming       60       500

         Date  Exercise  Minutes  Calories
0  2024-01-01   Running       30       300
1  2024-01-02   Cycling       45       400
2  2024-01-03  Swimming       60       500
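A new column can also be computed from existing columns instead of typed in by hand. A minimal sketch, assuming a hypothetical demo_log DataFrame and a hypothetical column name CaloriesPerMinute:

```python
import pandas as pd

# demo_log is a hypothetical DataFrame used only for this illustration
demo_log = pd.DataFrame({
    "Minutes": [30, 45, 60],
    "Calories": [300, 400, 500],
})

# Column arithmetic is applied row by row
demo_log["CaloriesPerMinute"] = demo_log["Calories"] / demo_log["Minutes"]
rates = demo_log["CaloriesPerMinute"].round(2).tolist()
print(rates)  # Output: [10.0, 8.89, 8.33]
```

When assigning a plain list as a new column, its length must match the number of rows; otherwise pandas raises a ValueError.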


## ❓ 2.5 Practice Questions
Question 1: Create a DataFrame student_scores with scores in Math, Science, and English. Perform these tasks:

1. Calculate the average score for each student and add it as a new column named "Average".
2. Find the student(s) with the highest average score.
3. Sort the DataFrame based on the average score in descending order.

In [81]:

# Creating a DataFrame with student scores
student_scores = pd.DataFrame({
    "Students": ["Malisa", "Akoth", "Awino"],
    "Math": [70, 80, 10],
    "Science": [30, 50, 60],
    "English": [50, 40, 10]
})
student_scores

  Students  Math  Science  English
0   Malisa    70       30       50
1    Akoth    80       50       40
2    Awino    10       60       10


In [91]:

# Calculating the average score
student_scores["Average"] = student_scores[["Math", "Science", "English"]].mean(axis=1)
student_scores

  Students  Math  Science  English    Average
0   Malisa    70       30       50  50.000000
1    Akoth    80       50       40  56.666667
2    Awino    10       60       10  26.666667


In [95]:
# Finding the student with the highest average score
top_student = student_scores.loc[student_scores["Average"].idxmax()]
top_student

Students        Akoth
Math               80
Science            50
English            40
Average     56.666667
Name: 1, dtype: object
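The task asks for the student(s) with the highest average, but idxmax() returns only the first row label where the maximum occurs. If ties are possible, one option is to compare every average against the maximum. A minimal sketch on hypothetical data (demo_scores, with two students sharing the top average):

```python
import pandas as pd

# demo_scores is hypothetical data in which two students share the top average
demo_scores = pd.DataFrame({
    "Students": ["A", "B", "C"],
    "Average": [56.0, 56.0, 40.0],
})

top_average = demo_scores["Average"].max()
top_students = demo_scores[demo_scores["Average"] == top_average]  # keeps every tied row

print(top_students["Students"].tolist())  # Output: ['A', 'B']
```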

In [97]:
# Sorting the DataFrame by average score
sorted_scores = student_scores.sort_values(by="Average", ascending=False)

print("Top Student:\n", top_student)
print("\nSorted Scores:\n", sorted_scores)

Top Student:
 Students        Akoth
Math               80
Science            50
English            40
Average     56.666667
Name: 1, dtype: object

Sorted Scores:
   Students  Math  Science  English    Average
1    Akoth    80       50       40  56.666667
0   Malisa    70       30       50  50.000000
2    Awino    10       60       10  26.666667


Question 2: Convert a list of temperatures from Celsius to Fahrenheit using list comprehension.

In [99]:

# List of temperatures in Celsius
celsius_temperatures = [0, 20, 37, 100]

# Convert to Fahrenheit
fahrenheit_temperatures = [(temp * 9/5) + 32 for temp in celsius_temperatures]
print(fahrenheit_temperatures)  # Output: [32.0, 68.0, 98.6, 212.0]


[32.0, 68.0, 98.6, 212.0]
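If the same conversion is needed in several places, one option is to wrap the formula in a small function and call it inside the comprehension. A minimal sketch; celsius_to_fahrenheit and demo_celsius are hypothetical names, not part of the exercise:

```python
def celsius_to_fahrenheit(celsius):
    """Convert one Celsius temperature to Fahrenheit."""
    return celsius * 9 / 5 + 32

# demo_celsius is a hypothetical list used only for this illustration
demo_celsius = [-40, 0, 100]
demo_fahrenheit = [celsius_to_fahrenheit(temp) for temp in demo_celsius]
print(demo_fahrenheit)  # Output: [-40.0, 32.0, 212.0]
```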
