# Working with Data in Python

Concepts:
* Understanding and using lists, dictionaries, and tuples
* Reading and writing files
* Introduction to the Pandas library

## Lists

A list is a collection of items stored in a specific order, where each item is identified by its index number. Lists have the following properties:
* They are ordered, meaning that the items have a defined order, and you can access them by their index.
* They are mutable, meaning that you can change the items after they have been created.
* They can contain any type of data, including other lists.

In [None]:
# Creating a list
my_list = [] # empty list
print(my_list)
my_list = [1, 2, 3, 4, "Hello", 3.14] # list with elements
print(my_list)

In [None]:
# Second list
second_list = [1, 2, 3, "Strings", "Floats(not)", ["Another", "List", 1, 2]]

print(second_list)

print(second_list[3]) # Element 4
print(second_list[0]) # Element 1

In [None]:
# Accessing elements
print(my_list[0]) # 1
print(my_list[4]) # Hello
print(my_list[-1]) # 3.14 Start from the end
print(my_list[-3]) # 4

In [None]:
# Adding elements
my_list.append(10)
print(my_list)
my_list.insert(2, 20) # insert 20 at index 2 (right after the 2nd element)
print(my_list)
my_list.insert(1, 30)
print(my_list)
my_list.extend([5, 6, 7])
print(my_list)

In [None]:
# Removing elements
my_list.remove(20) # removes the first occurrence of the value 20
print(my_list)
my_list.pop() # removes and returns the last element
print(my_list)
my_list.pop(2) # removes and returns the element at index 2
print(my_list)
del my_list[1] # del statement: deletes the element at index 1 (pop() or remove() is usually clearer)
print(my_list)

In [None]:
# Slicing
print(my_list[1:3]) # elements at index 1 and 2 (the stop index 3 is excluded)
print(my_list[1:]) # from index 1 to the end
print(my_list[:3]) # from the beginning up to, but not including, index 3
print(my_list[::2]) # every 2nd element, starting from the beginning
print(my_list[::-1]) # reversed order

In [None]:
my_list = [1, 2, 3, 4, 5, 6, 7, 8, 9]

In [None]:
# List Functions
print(len(my_list))
print(max(my_list))
print(min(my_list))
print(sum(my_list))

In [None]:
# Looping through a list
# Print element in the list 
for element in my_list:
    print(element)

print("---")
# See the index that you are working with
for index, element in enumerate(my_list):
    print(index, element)

In [None]:
# Exercise
my_list = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
## A longer list could be built with range():
# my_list = list(range(0, 100, 2))
# Create the list of even numbers (every second element, starting at index 1)
even_list = my_list[1::2]
print(even_list)
print("------")
# Within the even list, find which numbers are divisible by 3
for number in even_list:
    if number % 3 == 0:
        print(number)
print("------")
# Using with names
name_list = ["Avery", "Won", "Stewart", "Lumine", "Tom", "John"]
for name in name_list:
    if "e" in name:
        print(name)

In [None]:
# List Comprehension
squared_list = [x**2 for x in my_list]
print(squared_list)
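
A comprehension can also filter items with an `if` clause. The sketch below redoes the exercise above (even numbers, then those divisible by 3) in that style; the variable names are chosen here for illustration only.

```python
# Sample data matching the exercise above
numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# Keep only the even numbers
even_numbers = [n for n in numbers if n % 2 == 0]
print(even_numbers)  # [2, 4, 6, 8, 10]

# Keep the even numbers that are also divisible by 3
even_and_div_by_3 = [n for n in even_numbers if n % 3 == 0]
print(even_and_div_by_3)  # [6]
```

Unlike the slice `my_list[1::2]`, the `if` clause tests each value, so it still works when the list is not a simple 1-to-10 sequence.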

## Dictionaries

A dictionary is a collection of key-value pairs. Dictionaries have the following properties:
* They are accessed by key rather than by position. Since Python 3.7 they keep insertion order, but you still cannot access items by a numeric index.
* They are mutable, meaning that you can change the items after they have been created.
* Values can be any type of data, including other dictionaries. Keys must be hashable (for example strings, numbers, or tuples).

In [None]:
# Creating a dictionary
my_dict = {} # empty dictionary
print(my_dict)
my_dict = {"name": "John", "age": 25, "city": "New York", 3: 3.14159} # dictionary with elements
print(my_dict)
print(my_dict["city"])
print(my_dict[3])

In [None]:
# Accessing elements
print(my_dict["name"]) # John
print(my_dict["age"]) # 25
print(my_dict.get("city")) # New York
print(my_dict.get("country")) # None: the key does not exist, but get() does not raise an error
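
The difference between `get()` and square-bracket access can be sketched as follows. The `person` dictionary is a hypothetical stand-in for `my_dict` so the example runs on its own.

```python
# Hypothetical record, mirroring the dictionary used above
person = {"name": "John", "age": 25, "city": "New York"}

# get() accepts a second argument that is returned when the key is missing
country = person.get("country", "Unknown")
print(country)  # Unknown

# The in operator checks for a key without raising an error
has_city = "city" in person
has_zip = "zip" in person
print(has_city, has_zip)  # True False

# Square-bracket access raises KeyError for a missing key
try:
    person["country"]
except KeyError as error:
    missing_key = error.args[0]
    print("Missing key:", missing_key)  # Missing key: country
```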

In [None]:
# Adding elements
my_dict["country"] = "USA"
print(my_dict)
my_dict.update({"state": "NY", "zip": 10001})
print(my_dict)

In [None]:
# Removing elements
my_dict.pop("zip")
print(my_dict)
del my_dict["state"]
print(my_dict)

In [None]:
# Dictionary Functions
print(len(my_dict)) # number of key-value pairs
print(my_dict.keys())
print(my_dict.values())
print(my_dict.items())

In [None]:
# Looping through a dictionary
for key in my_dict:
    print(key, my_dict[key])
print("---")
for key, value in my_dict.items():
    print(key, value)

In [None]:
# Dictionary Comprehension
squared_dict = {x: x**2 for x in my_list}
print(squared_dict)

## Tuples

A tuple is a collection of items that are stored in a specific order. Their properties are:
* They are ordered, meaning that the items have a defined order, and you can access them by their index.
* They are immutable, meaning that you cannot change the items after they have been created.
* They can contain any type of data, including other tuples.

In [None]:
# Creating a tuple
my_tuple = () # empty tuple
print(my_tuple)
my_tuple = (1, 2, 3, 4, "Hello", 3.14) # tuple with elements
print(my_tuple)

In [None]:
# Accessing elements
print(my_tuple[0]) # 1
print(my_tuple[-1]) # 3.14 Start from the end
# Same as list

In [None]:
# Adding elements and removing elements
# Tuples are immutable, so you can't add elements or remove elements
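
A minimal sketch of what immutability means in practice: changing an element fails, while "adding" an element builds a new tuple and leaves the original untouched. The name `point` is used here only for illustration.

```python
point = (1, 2, 3)

# Assigning to an index raises TypeError because tuples are immutable
try:
    point[0] = 10
except TypeError as error:
    print("Cannot modify a tuple:", error)

# Concatenation builds a new tuple; the original is unchanged
longer_point = point + (4,)
print(point)         # (1, 2, 3)
print(longer_point)  # (1, 2, 3, 4)

# Tuples can be unpacked into separate variables
x, y, z = point
print(x, y, z)  # 1 2 3
```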

In [None]:
my_tuple = (1, 2, 3, 4, 5, 6, 7, 8, 9)

In [None]:
# Tuple Functions
print(len(my_tuple))
print(max(my_tuple))
# Same as list

In [None]:
# Looping through a tuple
for element in my_tuple:
    print(element)
# Same as list

In [None]:
# Exercise 2
## Create the dictionaries with name, age
Prac_dict_list = [
    {"name": "Won", "age": 30},
    {"name": "Avery", "age": 25},
    {"name": "Jimmy", "age": 35},
]
# print(Prac_dict_list)
## Print out only the names with a certain age range
for person in Prac_dict_list:
    print(person)
    if person["age"] >= 30 and person["age"] <= 40:
        print(person["name"])

## Reading and Writing Files

Python has built-in functions for reading and writing files. The `open()` function is used to open a file, and the `read()` and `write()` methods are used to read and write data to the file.

### File Usage Format

```
with open(filename, mode) as file_variable:
    # code to read or write to the file
```
Modes:
* `r`: read mode
* `w`: write mode
* `a`: append mode
* `rb`: read binary mode
* `wb`: write binary mode
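
The effect of each mode can be sketched with a throwaway file. The file name `modes_demo.txt` is hypothetical, and the example uses a temporary directory so it does not touch any existing file.

```python
import os
import tempfile

# Work in a temporary directory so no existing file is touched
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "modes_demo.txt")  # hypothetical file name

    with open(path, "w") as file:   # "w" creates the file or empties it
        file.write("first\n")
    with open(path, "w") as file:   # a second "w" discards "first"
        file.write("second\n")
    with open(path, "a") as file:   # "a" keeps the content and adds to the end
        file.write("third\n")

    with open(path, "r") as file:   # "r" reads text
        text = file.read()
    with open(path, "rb") as file:  # "rb" returns raw bytes instead of str
        raw = file.read()

print(repr(text))  # 'second\nthird\n'
print(type(raw))   # <class 'bytes'>
```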

In [None]:
# Writing to a file
with open("file.txt", "w") as file:
    file.write("Hello, World!")

In [None]:
# Outputting multiple lines
lines = ["Hello", "World", "How", "Are", "You"]
with open("file.txt", "w") as file:
    for line in lines:
        file.write(line + "\n")

In [None]:
# Appending to a file
with open("file.txt", "a") as file:
    file.write("I am fine, thank you!")

In [98]:
# Reading from a file
with open("file.txt", "r") as file:
    content = file.read()
    print(content)

Hello
World
How
Are
You
I am fine, thank you!


In [99]:
# Reading from a file line by line
with open("file.txt", "r") as file:
    for line in file:
        print(line)  # each line still ends with "\n", so print() adds a blank line after it

Hello

World

How

Are

You

I am fine, thank you!


In [100]:
# Reading from a file line by line and storing in a list
lines = []
with open("file.txt", "r") as file:
    for line in file:
        lines.append(line)
print(lines)

['Hello\n', 'World\n', 'How\n', 'Are\n', 'You\n', 'I am fine, thank you!']
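
The `\n` at the end of each stored line is usually unwanted. One way to drop it is sketched below, using a small hypothetical sample file written to a temporary directory.

```python
import os
import tempfile

# Write a small sample file in a temporary directory (hypothetical content)
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "sample.txt")
    with open(path, "w") as file:
        file.write("Hello\nWorld\nHow are you?")

    # rstrip("\n") drops the trailing newline from each line
    with open(path, "r") as file:
        clean_lines = [line.rstrip("\n") for line in file]

    # read().splitlines() gives the same result in one step
    with open(path, "r") as file:
        same_lines = file.read().splitlines()

print(clean_lines)  # ['Hello', 'World', 'How are you?']
```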


In [101]:
# Write a csv file
import csv
data = [["Name", "Age"], ["John", 25], ["Alice", 30], ["Bob", 35]]
# newline="" lets the csv module control line endings (avoids blank rows on Windows)
with open("data.csv", "w", newline="") as file:
    writer = csv.writer(file)
    writer.writerows(data)

# or, writing the rows by hand (this overwrites the file written above)

with open("data.csv", "w") as file:
    for row in data:
        # join() puts commas only between items, so no trailing comma is written
        file.write(",".join(str(item) for item in row) + "\n")


In [102]:
# Read a csv file
with open("data.csv", "r", newline="") as file:
    reader = csv.reader(file)
    for row in reader:
        print(row)
print()
# or, splitting each line by hand (this does not handle quoted fields)

with open("data.csv", "r") as file:
    for line in file:
        print(line.strip().split(","))

['Name', 'Age']
['John', '25']
['Alice', '30']
['Bob', '35']

['Name', 'Age']
['John', '25']
['Alice', '30']
['Bob', '35']
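
When a CSV file has a header row, the `csv` module's `DictWriter` and `DictReader` let you work with column names instead of positions. The sketch below is not part of the original notebook; the file name `people.csv` and the sample rows are assumptions made for illustration.

```python
import csv
import os
import tempfile

rows = [
    {"Name": "John", "Age": 25},
    {"Name": "Alice", "Age": 30},
]

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "people.csv")  # hypothetical file name

    # DictWriter writes dictionaries; fieldnames sets the column order
    with open(path, "w", newline="") as file:
        writer = csv.DictWriter(file, fieldnames=["Name", "Age"])
        writer.writeheader()
        writer.writerows(rows)

    # DictReader uses the header row as keys; every value comes back as a string
    with open(path, "r", newline="") as file:
        people = list(csv.DictReader(file))

for person in people:
    print(person["Name"], int(person["Age"]))
```

Note that the ages come back as strings, so they need `int()` before any arithmetic or comparison.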


## Pandas

Pandas is a powerful data manipulation library for Python. It provides data structures and functions for working with structured data. The two main data structures in Pandas are the `Series` and `DataFrame`.

### Series

A `Series` is a one-dimensional labeled array that can hold any type of data. It behaves partly like a list (values have positions) and partly like a dictionary (values have labels, called the index), with additional functionality.

### DataFrame

A `DataFrame` is a two-dimensional labeled table in which each column is a `Series` and different columns can hold different types of data. It is similar to a table or a spreadsheet, but with additional functionality.
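
The relationship between the two structures can be sketched as follows, assuming pandas is already installed. The names `ages` and `df_demo` and their values are made up for illustration.

```python
import pandas as pd

# A Series is a single labeled column of values
ages = pd.Series([25, 30, 35], index=["John", "Jane", "Alice"], name="Age")
print(ages["Jane"])  # 30 (looked up by label)
print(ages.iloc[0])  # 25 (looked up by position)

# A DataFrame is a table; selecting one column returns a Series
df_demo = pd.DataFrame({"Name": ["John", "Jane"], "Age": [25, 30]})
age_column = df_demo["Age"]
print(type(age_column).__name__)  # Series
print(df_demo.shape)              # (2, 2)
```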

In [108]:
# Install pandas
# The leading ! runs the command in the system shell rather than in Python
!pip install pandas

Collecting pandas
  Downloading pandas-2.2.2-cp312-cp312-win_amd64.whl.metadata (19 kB)
Collecting numpy>=1.26.0 (from pandas)
  Downloading numpy-1.26.4-cp312-cp312-win_amd64.whl.metadata (61 kB)
Collecting pytz>=2020.1 (from pandas)
  Downloading pytz-2024.1-py2.py3-none-any.whl.metadata (22 kB)
Collecting tzdata>=2022.7 (from pandas)
  Downloading tzdata-2024.1-py2.py3-none-any.whl.metadata (1.4 kB)
Downloading pandas-2.2.2-cp312-cp312-win_amd64.whl (11.5 MB)

In [109]:
import pandas as pd

# Creating a DataFrame
data = {
    "Name": ["John", "Jane", "Alice", "Bob"],
    "Age": [25, 30, 35, 40],
    "City": ["New York", "Los Angeles", "Chicago", "Houston"]
}
df = pd.DataFrame(data)
print(df)

    Name  Age         City
0   John   25     New York
1   Jane   30  Los Angeles
2  Alice   35      Chicago
3    Bob   40      Houston


In [None]:
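# Accessing elements
print(df["Name"])  # one column, returned as a Series
print()
print(df.loc[0, "Name"])  # a single value: row label 0, column "Name"
print()
print(df.iloc[0])  # the first row by position, returned as a Series
print()
print(df["Name"].tolist())  # the column converted to a plain Python list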

In [None]:
# Adding elements
df["Country"] = ["USA", "USA", "USA", "USA"]
print(df)

In [None]:
# Removing elements
df.drop("Country", axis=1, inplace=True)
# axis=1 for column, axis=0 for row
# inplace=True to modify the original DataFrame
print(df)

In [None]:
# DataFrame Functions
print(df.shape)
print(df.columns)
print(df.index)
print(df.head(n=2))
print(df.tail(n=2))

In [None]:
# Looping through a DataFrame
for index, row in df.iterrows():
    print(index, row["Name"], row["Age"], row["City"])
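
Looping with `iterrows()` works, but selecting rows that meet a condition is usually done with a boolean mask instead. The sketch below rebuilds the DataFrame from earlier under the hypothetical name `people` so it runs on its own; the age thresholds are arbitrary.

```python
import pandas as pd

# Same data as the DataFrame built earlier, under a separate name
people = pd.DataFrame({
    "Name": ["John", "Jane", "Alice", "Bob"],
    "Age": [25, 30, 35, 40],
    "City": ["New York", "Los Angeles", "Chicago", "Houston"],
})

# A comparison on a column gives a Series of True/False values (a mask)
is_30_or_older = people["Age"] >= 30

# Indexing with the mask keeps only the rows where it is True
older = people[is_30_or_older]
print(older["Name"].tolist())  # ['Jane', 'Alice', 'Bob']

# Combine conditions with & (and) or | (or); each needs its own parentheses
in_range = people[(people["Age"] >= 30) & (people["Age"] < 40)]
print(in_range["Name"].tolist())  # ['Jane', 'Alice']
```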

In [None]:
# Reading from a CSV file
df = pd.read_csv("data.csv")
print(df.head())

## Class Exercise
1. Create a class schedule using a dictionary, where the keys are the days of the week and the values are lists of classes. Then print the classes for a given day.
2. Using the Pandas library, read a CSV file and display the data in a table format. Then filter the data based on criteria of your choice and display the filtered data.