# Python Lists and Data Structures

This notebook covers key Python data structures used in data science:
- Lists
- Dictionaries
- Tuples
- Sets

## Lists

Lists are ordered, mutable collections that can store mixed data types.

In [None]:
# Creating and manipulating lists
languages = ['Python', 'R', 'SQL', 'Java']
print(f"Original list: {languages}")

# Adding elements
languages.append('Julia')
print(f"After append: {languages}")

# Indexing
print(f"First element: {languages[0]}")
print(f"Last element: {languages[-1]}")

# Slicing
print(f"Elements at indices 1-2: {languages[1:3]}")
print(f"All elements except the last two: {languages[:-2]}")
print(f"Every second element: {languages[::2]}")
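
The two properties named above, mixed types and mutability, can be sketched briefly. The `record` values below are made up for illustration; the point is that plain assignment creates a second name for the same list, while `copy()` creates an independent (shallow) copy.

```python
# A list may hold values of different types (illustrative values only)
record = ['Alex', 4, 72.5, True]

# Lists are mutable: item assignment changes the list in place
record[1] = 5

# Plain assignment does not copy; both names refer to the same list
alias = record
# copy() builds a new list with the same elements (a shallow copy)
snapshot = record.copy()

alias.append('New York')

print(f"record:   {record}")
print(f"alias:    {alias}")
print(f"snapshot: {snapshot}")
print(f"alias is record: {alias is record}")
```

Because `alias` and `record` are the same object, the append shows up under both names, while `snapshot` keeps the earlier contents.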

In [None]:
# List methods
languages = ['Python', 'R', 'SQL', 'Java']

# Insert at specific position
languages.insert(1, 'JavaScript')
print(f"After insert: {languages}")

# Remove an element
languages.remove('SQL')
print(f"After remove: {languages}")

# Pop an element (removes and returns)
popped = languages.pop(2)
print(f"Popped element: {popped}")
print(f"After pop: {languages}")

# Sort the list
languages.sort()
print(f"After sorting: {languages}")

# Reverse the list
languages.reverse()
print(f"After reversing: {languages}")
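
One detail about the methods above is easy to miss: `sort()` and `reverse()` change the list in place and return `None`. The sketch below contrasts `sort()` with the built-in `sorted()`, which returns a new list and leaves the original alone. The variable names are chosen only for this example.

```python
# sort() modifies the list in place and returns None
languages = ['Python', 'R', 'SQL', 'Java']
result = languages.sort()
print(f"Return value of sort(): {result}")
print(f"List after sort(): {languages}")

# sorted() returns a new list and leaves the original unchanged
original = ['Python', 'R', 'SQL', 'Java']
ordered = sorted(original)
print(f"Original: {original}")
print(f"Sorted copy: {ordered}")

# Both accept reverse=True for descending order
descending = sorted(original, reverse=True)
print(f"Descending: {descending}")
```

A common mistake is writing `languages = languages.sort()`, which replaces the list with `None`.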

## Dictionaries

Dictionaries are mutable collections of key-value pairs. Keys must be unique and hashable, which in practice means immutable values such as strings, numbers, or tuples of immutable items.

In [None]:
# Creating and accessing dictionaries
data_scientist = {
    'name': 'Alex',
    'skills': ['Python', 'Machine Learning', 'SQL'],
    'experience_years': 4,
    'education': {'degree': "Master's", 'field': 'Data Science'}
}

# Accessing values
print(f"Name: {data_scientist['name']}")
print(f"First skill: {data_scientist['skills'][0]}")
print(f"Degree: {data_scientist['education']['degree']}")

# Adding and modifying entries
data_scientist['location'] = 'New York'
data_scientist['experience_years'] = 5

print(f"Updated dictionary: {data_scientist}")

# Dictionary methods
print(f"Keys: {list(data_scientist.keys())}")
print(f"Values: {list(data_scientist.values())}")
print(f"Items: {list(data_scientist.items())}")
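
The key rules and lookups above can be illustrated with a short sketch. The `profile` and `grid` dictionaries are hypothetical examples, not part of the notebook's data. The sketch shows `get()` returning a default for a missing key, a tuple used as a key, and the `TypeError` raised when a list is used as a key.

```python
profile = {'name': 'Alex', 'experience_years': 4}

# Square-bracket access raises KeyError for a missing key;
# get() returns a default value instead
location = profile.get('location', 'unknown')
print(f"Location: {location}")

# 'in' tests whether a key is present
has_name = 'name' in profile
print(f"Has 'name' key: {has_name}")

# A tuple of immutable values is hashable, so it can be a key
grid = {(0, 0): 'origin', (3, 4): 'target'}
print(f"Value at (3, 4): {grid[(3, 4)]}")

# A list is mutable and unhashable, so it cannot be a key
try:
    bad = {['a', 'b']: 1}
except TypeError as e:
    error_message = str(e)
    print(f"Error: {e}")
```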

## Tuples

Tuples are immutable, ordered collections. They're often used to group a fixed number of related values, such as the coordinates of a point or the fields of a record.

In [None]:
# Creating and using tuples
point = (3, 4)
person = ('John', 'Doe', 35, 'Developer')

# Unpacking tuples
x, y = point
print(f"x={x}, y={y}")

first_name, last_name, age, occupation = person
print(f"{first_name} {last_name} is {age} years old and works as a {occupation}")

# Tuples are immutable
try:
    person[0] = 'Jane'  # This will raise an error
except TypeError as e:
    print(f"Error: {e}")
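
Unpacking also works inside a `for` loop and on function return values, which is where tuples tend to show up in practice. The following is a minimal sketch; the `points` list and the `min_max` helper are hypothetical names introduced for illustration.

```python
import math

# Unpack each (x, y) tuple directly in the loop header
points = [(0, 0), (3, 4), (6, 8)]
distances = []
for x, y in points:
    distance = math.hypot(x, y)  # Euclidean distance from the origin
    distances.append(distance)
    print(f"({x}, {y}) is {distance} from the origin")

# A function that returns several values actually returns one tuple
def min_max(values):
    return min(values), max(values)

lowest, highest = min_max([4, 1, 9])
print(f"Lowest: {lowest}, highest: {highest}")
```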

## Sets

Sets are unordered collections of unique elements. They're useful for membership testing and eliminating duplicates.

In [None]:
# Creating and using sets
fruits = {'apple', 'banana', 'orange', 'apple', 'pear'}
print(f"Fruits set (notice no duplicates): {fruits}")

vegetables = {'carrot', 'lettuce', 'peas'}

# Membership testing, adding, and removing elements
print(f"'apple' in fruits: {'apple' in fruits}")
fruits.add('grape')
print(f"After adding 'grape': {fruits}")
fruits.remove('banana')
print(f"After removing 'banana': {fruits}")

# Set operations with multiple sets
all_foods = fruits.union(vegetables)
print(f"All foods (union): {all_foods}")
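
Union is one of several set operations. The sketch below, using made-up example sets, shows intersection, difference, and symmetric difference with their operator forms, plus the duplicate-removal use mentioned above. Since sets are unordered, results are passed through `sorted()` to get a predictable display order.

```python
fruits = {'apple', 'orange', 'pear', 'tomato'}
salad_items = {'tomato', 'lettuce', 'pear'}

# Intersection: elements present in both sets
both = fruits & salad_items
print(f"In both: {sorted(both)}")

# Difference: elements in fruits that are not in salad_items
only_fruits = fruits - salad_items
print(f"Only in fruits: {sorted(only_fruits)}")

# Symmetric difference: elements in exactly one of the two sets
in_one = fruits ^ salad_items
print(f"In exactly one set: {sorted(in_one)}")

# Removing duplicates from a list (the original order is not preserved)
labels = ['a', 'b', 'a', 'c', 'b']
unique_labels = sorted(set(labels))
print(f"Unique labels: {unique_labels}")
```

The method forms `intersection()`, `difference()`, and `symmetric_difference()` do the same work and, like `union()`, accept any iterable as the argument.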

## Mini Practice Task

Create a dictionary representing a dataset with keys for 'name', 'rows', 'columns', and 'missing_values'. Write code to add a new key 'data_types' with a list of common data types.

In [None]:
# Example solution
dataset = {
    'name': 'customer_data',
    'rows': 1500,
    'columns': 8,
    'missing_values': 45
}

# Add new key 'data_types' with a list of common data types
dataset['data_types'] = ['int', 'float', 'str', 'datetime', 'categorical']

print("Dataset information:")
for key, value in dataset.items():
    print(f"{key}: {value}")