## Week 3: Data Structures in Python

By the end of this week, you will be able to:
- Use lists and dictionaries to store and organize data
- Use more complicated nested data structures to solve Python problems

## Lists and Dictionaries

### Creation, indexing, and manipulation

Lists and dictionaries are among the most frequently used data structures in Python.

In [1]:
# Creating lists
list1 = [1, 2, 3]
list2 = ['a', 'b', 'c']
list3 = [1, 'a', 2.5]
list4 = ["Messi", "Ronaldo", "Mbappe", "Haaland", "Lewandowski"]


# Creating dictionaries
dict1 = {'key1': 'value1', 'key2': 'value2'}
dict2 = {1: 'apple', 2: 'banana'}
dict3 = {'name': 'John', 'age': 30}
dict4 = {'Messi': 30, 'Ronaldo': 20, 'Mbappe': 10}

# What is another dictionary we could make?

#### Accessing Elements in Lists

In [2]:
# Let's consider a list of top goal scorers for a season.
top_goal_scorers = ["Messi", "Ronaldo", "Mbappe", "Haaland", "Lewandowski"]

# To access the first goal scorer:
first_scorer = top_goal_scorers[0]
print(f"The top goal scorer is {first_scorer}.")  # Output: The top goal scorer is Messi.

# How would we access the second scorer?


The top goal scorer is Messi.


In [3]:
# To access the last goal scorer:
last_scorer = top_goal_scorers[-1]
print(f"The last top goal scorer is {last_scorer}.")  # Output: The last top goal scorer is Lewandowski.

# How would we access the third top goal scorer?

The last top goal scorer is Lewandowski.


#### Modifying Elements in Lists

In [4]:
# Suppose Haaland overtakes Mbappe in the goal-scoring chart.
top_goal_scorers[2] = "Haaland"
top_goal_scorers[3] = "Mbappe"

# Now the list looks like:
print(top_goal_scorers)  # Output: ['Messi', 'Ronaldo', 'Haaland', 'Mbappe', 'Lewandowski']


['Messi', 'Ronaldo', 'Haaland', 'Mbappe', 'Lewandowski']


#### Appending and Deleting Elements in Lists

In [5]:
# Appending a new player
top_goal_scorers.append("Neymar")

top_goal_scorers

top_goal_scorers.append("Kane")

top_goal_scorers

# Deleting the last player
# del top_goal_scorers[-1]


['Messi', 'Ronaldo', 'Haaland', 'Mbappe', 'Lewandowski', 'Neymar', 'Kane']
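The commented-out `del` above is one of several ways to remove items from a list. The following is a minimal sketch that works on a copy (the names `scorers` and `removed` are introduced here only for illustration), so the list used in later cells stays unchanged:

```python
# Work on a separate list so the original used in later cells is unchanged.
scorers = ['Messi', 'Ronaldo', 'Haaland', 'Mbappe', 'Lewandowski', 'Neymar', 'Kane']

del scorers[-1]            # delete by position: removes 'Kane'
removed = scorers.pop()    # remove and return the last item: 'Neymar'
scorers.remove('Ronaldo')  # delete by value: removes the first 'Ronaldo'

print(removed)   # Neymar
print(scorers)   # ['Messi', 'Haaland', 'Mbappe', 'Lewandowski']
```

Use `del` or `pop()` when you know the position, and `remove()` when you know the value.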

#### Using Lists and Loops

In [6]:
for playerName in top_goal_scorers:
    print(f'Player is: {playerName}')

Player is: Messi
Player is: Ronaldo
Player is: Haaland
Player is: Mbappe
Player is: Lewandowski
Player is: Neymar
Player is: Kane


In [11]:
topScorerListOfLists = [["Messi", 50], ["Ronaldo", 43], 
                    ["Mbappe", 41], ["Haaland", 28], ["Lewandowski", 25]]

for playerStat in topScorerListOfLists:
    print(playerStat)

# How do we only print the number of goals?

['Messi', 50]
['Ronaldo', 43]
['Mbappe', 41]
['Haaland', 28]
['Lewandowski', 25]


In [8]:
for index, playerStat in enumerate(topScorerListOfLists):
    print(f'Index of list is: {index}, Value is {playerStat}')

Index of list is: 0, Value is ['Messi', 50]
Index of list is: 1, Value is ['Ronaldo', 43]
Index of list is: 2, Value is ['Mbappe', 41]
Index of list is: 3, Value is ['Haaland', 28]
Index of list is: 4, Value is ['Lewandowski', 25]


In [9]:
# Complete the function that takes in a list of numbers, and returns the maximum
# Use if/else, for or while loops and other skills you have learned
# Example input: [1, 4, 9, 3, 2] Output: 9
def findMax(listOfValues):
    # your code here
    pass  # placeholder so the cell runs until you add your solution

#### Accessing Elements in Dictionaries

In [8]:
# Let's consider a dictionary of player statistics.
player_stats = {
    "Messi": {"goals": 50, "assists": 20},
    "Ronaldo": {"goals": 45, "assists": 15},
    "Mbappe": {"goals": 40, "assists": 18}
}

# To access Messi's goals:
messi_goals = player_stats["Messi"]["goals"]
print(f"Messi's goals: {messi_goals}")  # Output: Messi's goals: 50


Messi's goals: 50
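Indexing a dictionary with a key that is not present raises a `KeyError`. As a hedged sketch, the built-in `dict.get()` method and the `in` operator are two common ways to handle missing keys; the player "Kane" and the names `kane_stats` and `kane_goals` are assumptions chosen only for this example:

```python
player_stats = {
    "Messi": {"goals": 50, "assists": 20},
    "Ronaldo": {"goals": 45, "assists": 15},
    "Mbappe": {"goals": 40, "assists": 18},
}

# .get() returns a default value instead of raising KeyError for a missing key.
kane_stats = player_stats.get("Kane", {})
kane_goals = kane_stats.get("goals", 0)
print(kane_goals)  # 0

# Testing membership before indexing is another option.
if "Messi" in player_stats:
    print(player_stats["Messi"]["assists"])  # 20
```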


In [None]:
team_matchup_stats = {}

team_matchup_stats['Barcelona'] = []

team_matchup_stats

{'Barcelona': []}

#### List Comprehensions

List comprehensions in Python offer a concise and readable way to create lists.

A list comprehension places a `for` loop, along with an optional `if` condition, inside square brackets.

This shortens the code, and it is often somewhat faster than the equivalent `for` loop that calls `append()`.

How does list comprehension work?

`[expression for item in iterable if condition]`


Why Use List Comprehensions?

- **Readability:** They make the code more readable and straightforward.
- **Efficiency:** Often faster than traditional for loops and manual list appending.
- **Versatility:** Useful for creating new lists where each element is the result of some operation applied to each member of another sequence or iterable.

Let us go back to our list of lists. We want to extract the names of the players and save them in a separate list.


First, do this with a list comprehension:

In [12]:
topScorerListOfLists

[['Messi', 50],
 ['Ronaldo', 43],
 ['Mbappe', 41],
 ['Haaland', 28],
 ['Lewandowski', 25]]

In [13]:
player_names = [player[0] for player in topScorerListOfLists]
player_names

['Messi', 'Ronaldo', 'Mbappe', 'Haaland', 'Lewandowski']

The same with the loop:

In [18]:
player_names = []
for player in topScorerListOfLists:
    player_names.append(player[0])
player_names

['Messi', 'Ronaldo', 'Mbappe', 'Haaland', 'Lewandowski']

Now we can add an `if` condition to extract only the players who scored more than 30 goals:

In [15]:
player_names = [player[0] for player in topScorerListOfLists if player[1] > 30]
player_names

['Messi', 'Ronaldo', 'Mbappe']

The same with the loop:

In [19]:
player_names = []
for player in topScorerListOfLists:
    if player[1] > 30:
        player_names.append(player[0])
player_names

['Messi', 'Ronaldo', 'Mbappe']

A more complex example

Suppose you have a list where each element is a tuple containing a football player's name and their match ratings over a few games.

Your goal is to calculate the average rating for each player and create a list of tuples with the player's name and their average rating. However, you only want to include players whose average rating is above a certain threshold, say 7.0.

In [20]:
player_ratings = [
    ('Messi', [8.5, 9.0, 7.5, 8.0]),
    ('Ronaldo', [7.0, 6.5, 6.0, 7.5]),
    ('Mbappe', [7.5, 8.0, 7.0, 8.5]),
    ('Haaland', [6.5, 7.0, 6.5, 6.0]),
    ('Lewandowski', [8.0, 9.0, 8.5, 9.0])
]

In [22]:
top_performers = []
for player, ratings in player_ratings:
    average_rating = sum(ratings) / len(ratings)
    if average_rating > 7.0:
        top_performers.append((player, average_rating))
top_performers

[('Messi', 8.25), ('Mbappe', 7.75), ('Lewandowski', 8.625)]

The same with a list comprehension:

In [27]:
top_performers = [(player, sum(ratings) / len(ratings))
                  for player, ratings in player_ratings
                  if sum(ratings) / len(ratings) > 7.0]
top_performers

[('Messi', 8.25), ('Mbappe', 7.75), ('Lewandowski', 8.625)]
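The comprehension above computes each player's average twice: once in the condition and once in the result. One possible way to avoid the repetition, assuming Python 3.8 or newer, is an assignment expression (`:=`); the name `average` is introduced here only for illustration:

```python
player_ratings = [
    ('Messi', [8.5, 9.0, 7.5, 8.0]),
    ('Ronaldo', [7.0, 6.5, 6.0, 7.5]),
    ('Mbappe', [7.5, 8.0, 7.0, 8.5]),
    ('Haaland', [6.5, 7.0, 6.5, 6.0]),
    ('Lewandowski', [8.0, 9.0, 8.5, 9.0]),
]

# The condition runs first for each item, so `average` is already bound
# when the result tuple is built.
top_performers = [
    (player, average)
    for player, ratings in player_ratings
    if (average := sum(ratings) / len(ratings)) > 7.0
]
print(top_performers)  # [('Messi', 8.25), ('Mbappe', 7.75), ('Lewandowski', 8.625)]
```

When a comprehension grows this dense, the plain `for` loop is often the more readable choice.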

## Task


For the list of dictionaries `player_stats` given below, do the following:

**Top Goal Scorers:** Use list comprehension to create a list of names of players who scored more than 5 goals in the tournament.

**Players with More Assists than Goals:** Create a list of players whose number of assists is greater than their number of goals. Format each element as a string: "Player Name - Team".

**Most Valuable Player:** Generate a list with the total contribution of each player (goals + assists). This is a list of lists, each containing the name of the player and their MVP score.

In [31]:
player_stats = [
    {'name': 'Messi', 'team': 'FC Barcelona', 'goals': 5, 'assists': 7},
    {'name': 'Ronaldo', 'team': 'Juventus', 'goals': 6, 'assists': 5},
    {'name': 'Neymar', 'team': 'PSG', 'goals': 2, 'assists': 9},
    {'name': 'Lewandowski', 'team': 'Bayern Munich', 'goals': 8, 'assists': 3},
]



In [None]:
# What if Barcelona played Real Madrid and Atletico Madrid? How would we add that?


# We have a new team to add; Tottenham has played Manchester United and Arsenal

### Time complexity and efficient operations

Understanding time complexity helps you write efficient code. For instance, appending to the end of a list is amortized O(1), whereas inserting at an arbitrary position is O(n) in the worst case, because every later element must shift by one place.
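The difference can be observed with a rough timing sketch. The helper functions `build_with_append` and `build_with_insert` and the size `n` are assumptions made for this illustration, and the measured times depend on the machine:

```python
from timeit import timeit

def build_with_append(n):
    items = []
    for i in range(n):
        items.append(i)      # adds at the end: amortized O(1)
    return items

def build_with_insert(n):
    items = []
    for i in range(n):
        items.insert(0, i)   # adds at the front: O(n), every item shifts
    return items

n = 20_000
append_time = timeit(lambda: build_with_append(n), number=1)
insert_time = timeit(lambda: build_with_insert(n), number=1)
print(f"append: {append_time:.4f}s, insert at front: {insert_time:.4f}s")
```

On most machines the front-insert version should be noticeably slower, and the gap widens as `n` grows.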

**Mini-Project**: Building a function that manages a list of football players and their statistics.

## Tuples

Tuples are one of the basic data structures in Python. They are similar to lists, but with a key difference: tuples are immutable. This means once a tuple is created, its contents cannot be modified. 

Tuples are commonly used for data that should not change throughout the execution of a program, thus ensuring data integrity.

Defining a Tuple:

- Tuples are defined by enclosing the elements in parentheses ().

- A `tuple` can contain elements of different data types, including other tuples.
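Two details of the definition are easy to miss: a one-element tuple needs a trailing comma, and tuples can be nested. A short sketch (the variable names are chosen only for illustration):

```python
single = ('Messi',)        # the trailing comma makes this a one-element tuple
not_a_tuple = ('Messi')    # without the comma this is just a string
nested = ('Messi', (34, 'Forward'))  # a tuple containing another tuple

print(type(single).__name__)       # tuple
print(type(not_a_tuple).__name__)  # str
print(nested[1][0])                # 34
```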

In [41]:
# player info as a list
player_info = ['Messi', 34, 'Forward']
player_info

['Messi', 34, 'Forward']

In [42]:
# Modify the second value in the list
player_info[1] = 40
player_info

['Messi', 40, 'Forward']

Now let's create a tuple:

In [44]:
player_info = ('Messi', 34, 'Forward')
player_info

('Messi', 34, 'Forward')

In [None]:
# Try to modify its content (this raises a TypeError because tuples are immutable)
player_info[1] = 40

Characteristics of Tuples

1. Immutability: Once a tuple is created, you cannot add, remove, or modify its elements.

2. Indexing and Slicing: Similar to lists, you can access elements by indexing and slice tuples. Syntax is the same as that of lists.
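As a brief sketch of the slicing syntax on a tuple, note that a slice of a tuple is itself a tuple:

```python
player_info = ('Messi', 34, 'Forward')

print(player_info[-1])   # Forward
print(player_info[0:2])  # ('Messi', 34)
print(player_info[1:])   # (34, 'Forward')
```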

In [45]:
player_info[1]

34

You can loop through a tuple using a for loop, similar to a list.

In [50]:
for item in player_info:
    print(item)

Messi
34
Forward


**Tuple Unpacking:** Python allows you to unpack the tuple into variables in a convenient way.

In [51]:
name, age, position = player_info
name

'Messi'

In [52]:
age

34

**Tuple Methods**

Tuples have fewer methods compared to lists, due to their immutability. Key methods include:

- .count(value): Returns the number of times value appears in the tuple.
- .index(value): Returns the first index of value.

In [55]:
players = ("CF", "MF", "CF", "D", "GK", "CF", "MF", "MF", "DF")

In [56]:
players.count('CF')

3

`.index()` returns the index of only the first occurrence of the specified value:

In [57]:
players.index('CF')

0

## Sets

A set in Python is an unordered collection of unique elements. 
Sets are mutable and can be modified after their creation, but they can only contain unique items (no duplicates).

Creating a Set:

A set is defined by enclosing its elements in curly braces {} or by calling the set() function.

Example: `teams = {'Barcelona', 'Real Madrid', 'Liverpool'}`


In [59]:
teams = {'Barcelona', 'Real Madrid', 'Liverpool'}
teams

{'Barcelona', 'Liverpool', 'Real Madrid'}

What happens when you add duplicated values?

In [62]:
teams = {'Barcelona', 'Real Madrid', 'Liverpool', 'Liverpool'}
teams

{'Barcelona', 'Liverpool', 'Real Madrid'}

If you think sets are like dictionaries, think twice!
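Although both use curly braces, sets and dictionaries behave differently. The sketch below illustrates a few of the differences; the variable names are assumptions made for this example:

```python
empty_braces = {}      # empty braces create a dictionary, not a set
empty_set = set()      # use set() for an empty set
print(type(empty_braces).__name__)  # dict
print(type(empty_set).__name__)     # set

# set() also builds a set from any iterable, dropping duplicates.
teams_from_list = set(['Barcelona', 'Liverpool', 'Liverpool'])
print(len(teams_from_list))  # 2

# Sets hold only values (no key-value pairs) and cannot be indexed.
try:
    teams_from_list[0]
except TypeError as error:
    print(f"TypeError: {error}")
```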

### Basic Operations with Sets

- **Adding Elements:** Use add() to add a single item.
- **Removing Elements:** Use remove() or discard() to delete items.
- **Set Operations:** Perform union, intersection, and difference operations, useful in comparative analysis.
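The operations listed above can be sketched as follows. The team names and the set `other_teams` are made up for this example, and `sorted()` is used only to print the results in a predictable order:

```python
teams = {'Barcelona', 'Real Madrid', 'Liverpool'}

teams.add('Juventus')       # add a single item
teams.remove('Liverpool')   # remove an item; raises KeyError if it is missing
teams.discard('Arsenal')    # discard never raises, even if the item is missing

other_teams = {'Juventus', 'PSG'}
print(teams | other_teams == teams.union(other_teams))  # True
print(sorted(teams & other_teams))                      # ['Juventus']
print(sorted(teams - other_teams))                      # ['Barcelona', 'Real Madrid']
```

The operators `|`, `&`, and `-` are shorthand for the `union()`, `intersection()`, and `difference()` methods when both operands are sets.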

In [20]:
real_madrid = {
    'Karim Benzema', 'Luka Modric', 'Toni Kroos', 
    'Eden Hazard', 'Thibaut Courtois', 'Vinicius Junior',
    'Luis Figo', 'Michael Laudrup', 'Ronaldo Nazario', 'Luis Enrique', 'Gheorghe Hagi'
}

fc_barcelona = {
    'Lionel Messi', 'Xavi Hernandez', 'Andres Iniesta', 'Luis Enrique',
    'Gerard Pique', 'Carles Puyol', 'Ronaldinho', 'Robert Lewandowski',
    'Luis Figo', 'Gheorghe Hagi', 'Michael Laudrup', 'Ronaldo Nazario'
}


Get the players who played both for Real Madrid and Barcelona (Intersection)

In [21]:
real_madrid.intersection(fc_barcelona)

{'Gheorghe Hagi',
 'Luis Enrique',
 'Luis Figo',
 'Michael Laudrup',
 'Ronaldo Nazario'}

Players who played for Real Madrid but not for FC Barcelona

In [23]:
real_madrid.difference(fc_barcelona)

{'Eden Hazard',
 'Karim Benzema',
 'Luka Modric',
 'Thibaut Courtois',
 'Toni Kroos',
 'Vinicius Junior'}

Players who played for FC Barcelona but not for Real Madrid

In [None]:
## your code here

Let's recall the main concepts from OOP, in particular classes and methods, and create a `Player` class:

In [45]:
class Player:
    def __init__(self, last_name, first_name, goals):
        self.first_last_name = (last_name, first_name)        
        self.goals = goals

Create an object Messi

In [47]:
messi = Player("Messi", "Lionel",  25)
messi.first_last_name

('Messi', 'Lionel')

Extract the first element of the tuple (here, the last name)

In [48]:
messi.first_last_name[0]

'Messi'

Can you change it?

In [49]:
messi.first_last_name[0] = 'Habet'

TypeError: 'tuple' object does not support item assignment

However, reassigning the whole attribute works:

In [50]:
messi.first_last_name = ('Habet', 'Madoyan')
messi.first_last_name

('Habet', 'Madoyan')

Why does this happen?

In the Player class, first_last_name is indeed initialized as a tuple, which is an immutable data structure.

However, when you execute messi.first_last_name = ('Habet', 'Madoyan'), you are not mutating the tuple itself; rather, you are reassigning the first_last_name attribute of the messi object to point to a new object, which is the tuple ('Habet', 'Madoyan').
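One way to see that the original tuple is untouched is to keep a second reference to it before reassigning the attribute. This is a minimal sketch; the name `original_tuple` is introduced only for illustration:

```python
class Player:
    def __init__(self, last_name, first_name, goals):
        self.first_last_name = (last_name, first_name)
        self.goals = goals

messi = Player("Messi", "Lionel", 25)
original_tuple = messi.first_last_name   # keep a second reference to the tuple

messi.first_last_name = ('Habet', 'Madoyan')   # rebinds the attribute to a new tuple

print(original_tuple)                           # ('Messi', 'Lionel')
print(messi.first_last_name is original_tuple)  # False
```

The old tuple still holds its original values; only the attribute now refers to a different object.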

Maybe we also need to talk about lambda functions?