### Syntax

```python
new_list = [expression for item in iterable if condition]
```

- **`expression`**: The value or transformation to apply to each item in the iterable.
- **`item`**: A variable representing each element in the iterable.
- **`iterable`**: The source of elements (e.g., a list, range, or another iterable).
- **`condition`** *(optional)*: A filter to include only items that satisfy the condition.
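
As a minimal sketch of the pattern above, the cell below builds one list with a transformation only and a second list with a filter added. The names `numbers`, `squares`, and `even_squares` are made up for this illustration.

```python
numbers = [1, 2, 3, 4, 5, 6]

# expression only: square every item
squares = [n ** 2 for n in numbers]

# expression plus condition: square only the even items
even_squares = [n ** 2 for n in numbers if n % 2 == 0]

print(squares)       # [1, 4, 9, 16, 25, 36]
print(even_squares)  # [4, 16, 36]
```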

### Assignment Creates a Reference, Not a Copy
When you assign a list to another variable, both names refer to the same list object in memory. A change made through one name is visible through the other.


In [61]:
list1 = [1, 2, 3]
list2 = list1  # Assign list1 to list2

list2[0] = 99  # Modify list2
print("list1:", list1)  # list1 is also modified
print("list2:", list2)

list1: [99, 2, 3]
list2: [99, 2, 3]


In [5]:
#slicing
list1 = [1, 2, 3]
list2 = list1[:]  # Create a shallow copy
list2[0] = 99
print("list1:", list1)  # list1 remains unchanged
print("list2:", list2)

#copy
import copy
list1 = [1, 2, 3]
list2 = copy.copy(list1)  # Create a shallow copy
list2[0] = 99
print("list1:", list1)  # list1 remains unchanged: the two lists are independent
print("list2:", list2)

list1: [1, 2, 3]
list2: [99, 2, 3]
list1: [1, 2, 3]
list2: [99, 2, 3]
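
The copies above are shallow, which only matters once a list contains other mutable objects. The sketch below (with an invented `nested` list) suggests how a shallow copy still shares its inner lists, while `copy.deepcopy` does not.

```python
import copy

nested = [[1, 2], [3, 4]]
shallow = copy.copy(nested)    # new outer list, same inner lists
deep = copy.deepcopy(nested)   # new outer list and new inner lists

nested[0][0] = 99              # change an inner list through the original

print(shallow[0][0])  # 99: the inner list is shared with nested
print(deep[0][0])     # 1: the deep copy is fully independent
```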


### Functions and lists
When passing a list to a function, the function works with the same object unless the list is explicitly copied. This concept is directly related to the idea of pass by reference versus pass by value. You'll see an example of this below.



In [None]:
# pass by reference
def modify_list(lst):
    lst.append(4)
    lst[0]=2

my_list = [1, 2, 3]
modify_list(my_list)
print(my_list)  # The original list is modified



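#looks like pass by value: rebinding the parameter does not affect the caller
def changeMe(foo):
    foo=5  #rebinds the local name foo only

#ints, floats, and other immutable objects cannot be changed in place,
#so the caller's variable keeps its value
x=4
changeMe(x)
print(x)  #prints 4

#Discussion: is this PASS BY REFERENCE or PASS BY VALUE?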
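
One way to think about the question above is that the difference comes from what the function does with its parameter, not from the type alone. The sketch below uses two hypothetical helpers, `rebind` and `mutate`, to show that even a list is left alone when the parameter is merely reassigned.

```python
def rebind(lst):
    # Assigning to the parameter creates a new local list;
    # the caller's list is untouched
    lst = [0, 0, 0]

def mutate(lst):
    # Changing the object through the shared reference
    # is visible to the caller
    lst[0] = 0

data = [1, 2, 3]

rebind(data)
after_rebind = list(data)   # snapshot for comparison
mutate(data)

print(after_rebind)  # [1, 2, 3]
print(data)          # [0, 2, 3]
```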




### Sorting a List
You can sort a list in place using the `sort` method. For example:


In [67]:
list_ex5=[1,2,8,9,11,45,33,12,7,3,18,92,31,22, 4]
print(list_ex5)
list_ex5.sort()

#help(list)  #look at the params for sort
print(list_ex5)
list_ex5.sort(reverse=True) 
print(list_ex5) 



[1, 2, 8, 9, 11, 45, 33, 12, 7, 3, 18, 92, 31, 22, 4]
[1, 2, 3, 4, 7, 8, 9, 11, 12, 18, 22, 31, 33, 45, 92]
[92, 45, 33, 31, 22, 18, 12, 11, 9, 8, 7, 4, 3, 2, 1]
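
When the original order needs to be kept, the built-in `sorted` function is an alternative to the `sort` method. The sketch below (with made-up lists `scores` and `words`) contrasts the two and shows the `key` parameter, which both accept.

```python
scores = [8, 3, 11, 1]

ordered = sorted(scores)     # returns a new sorted list
print(scores)                # [8, 3, 11, 1]  (unchanged)
print(ordered)               # [1, 3, 8, 11]

words = ["pear", "fig", "banana"]
by_length = sorted(words, key=len)   # sort by the length of each string
print(by_length)             # ['fig', 'pear', 'banana']

result = scores.sort()       # sorts in place and returns None
print(result)                # None
print(scores)                # [1, 3, 8, 11]
```

A common slip is writing `scores = scores.sort()`, which replaces the list with `None`.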


### List slicing
You can extract or modify a section of a list using slicing. To do this, write `start:stop:step` within square brackets, where `start` is the first index included, `stop` is the first index excluded, and `step` (optional, default 1) is the stride between selected elements.

In [69]:
list_ex6=[34, 242, 23,12, 67, 89, 223, 56, 99, 89,100]

# Extract a slice from index 2 to 5 (stop index is excluded)
print("list_ex6[2:6] ", list_ex6[2:6])

#Omit the `start` index to begin from the start and up to and not including the stop:
print("list_ex6[:7] ", list_ex6[:7])

#Omit the `stop` index to include all elements till the end:
print("list_ex6[2:] ",list_ex6[2:])

#Negative indices count from the end of the list
print("list_ex6[-4:] ", list_ex6[-4:])

#Extract every third element
print("list_ex6[::3] ", list_ex6[::3])

#Reverse the list
print( "list_ex6[::-1] ",list_ex6[::-1])


list_ex6[2:6]  [23, 12, 67, 89]
list_ex6[:7]  [34, 242, 23, 12, 67, 89, 223]
list_ex6[2:]  [23, 12, 67, 89, 223, 56, 99, 89, 100]
list_ex6[-4:]  [56, 99, 89, 100]
list_ex6[::3]  [34, 12, 223, 89]
list_ex6[::-1]  [100, 89, 99, 56, 223, 89, 67, 12, 23, 242, 34]
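
The examples above only extract slices. As a small sketch of the "modify" side, a slice can also appear on the left of an assignment or in a `del` statement. The list `values` is invented for this illustration.

```python
values = [10, 20, 30, 40, 50]

# Replace the items at index 1 and 2
values[1:3] = [21, 31]
print(values)   # [10, 21, 31, 40, 50]

# The replacement may have a different length than the slice
values[1:3] = [0]
print(values)   # [10, 0, 40, 50]

# Delete the first two items
del values[:2]
print(values)   # [40, 50]
```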


### Lists versus Tuples in Python

While lists are more versatile and can replace tuples in most scenarios, there are specific reasons to choose tuples over lists:

**Space**:  
     Lists require more memory because they are designed to grow dynamically: a list keeps extra bookkeeping for resizing and may reserve spare capacity for future additions. Tuples, being immutable, have a fixed size and use less memory than lists of the same length.
     
**Efficiency**:  
     In CPython, a tuple stores the references to its elements directly inside the tuple object, while a list holds a pointer to a separate array of references. This extra indirection can make element retrieval slightly faster for tuples. However, other operations will run faster on lists, such as those that modify the data in place. It depends on what you commonly do with your data.


---

In [4]:
#Example showing tuples take less storage space than lists for the same elements
tuple_ex = (2, 4, 2, 'Data Analytics')
list_ex = [2, 4, 2, 'Data Analytics']
print("Space taken by tuple =",tuple_ex.__sizeof__()," bytes")
print("Space taken by list =",list_ex.__sizeof__()," bytes")

Space taken by tuple = 56  bytes
Space taken by list = 72  bytes


In [1]:
#Comparing retrieval time for a list and a tuple
#Note: each timer starts before the container is built, so the times include construction as well as the slice
import time as t
# Retrieving elements from a list
tt = t.time()
list_ex = list(range(1000000))  # List containing the integers 0 to 999,999
result = list_ex[::-2]  # every 2nd element, starting from the end
print("Time taken to retrieve every 2nd element from a list =", t.time() - tt)

# Retrieving elements from a tuple
tt = t.time()
tuple_ex = tuple(range(1000000))  # Tuple containing the integers 0 to 999,999
result = tuple_ex[::-2]  # every 2nd element, starting from the end
print("Time taken to retrieve every 2nd element from a tuple =", t.time() - tt)



Time taken to retrieve every 2nd element from a list = 0.016310930252075195
Time taken to retrieve every 2nd element from a tuple = 0.01232004165649414
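
The timings above include the cost of building each container, and a single run can be noisy. A possibly fairer sketch, assuming the standard-library `timeit` module is acceptable here, builds the containers first and repeats only the slice. The names `big_list` and `big_tuple` are invented, and results will vary by machine and Python version, so no expected numbers are shown.

```python
import timeit

# Build the containers before timing so only the slice is measured
big_list = list(range(100_000))
big_tuple = tuple(range(100_000))

# Run each slice 100 times and report the total elapsed time in seconds
list_time = timeit.timeit(lambda: big_list[::-2], number=100)
tuple_time = timeit.timeit(lambda: big_tuple[::-2], number=100)

print("list slice: ", list_time)
print("tuple slice:", tuple_time)
```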


In [1]:
#Less Memory example
tuple_ex = (2, 4, 2, 'Data')
list_ex = [2, 4, 2, 'Data']
print("Size of tuple =", tuple_ex.__sizeof__(), "bytes")
print("Size of list =", list_ex.__sizeof__(), "bytes")




Size of tuple = 56 bytes
Size of list = 72 bytes


***Tuples versus Lists:*** By understanding the differences between lists and tuples, you can make an informed decision about which data structure to use based on your specific needs. Tuples are the better choice when memory efficiency and faster access are priorities, while lists are ideal for scenarios requiring flexibility and dynamic modifications.

In [11]:
#Example showing that tuple() does not copy an existing tuple, while list() creates a new list
tuple_cpy = tuple(tuple_ex)
print("Is tuple_copy same as tuple_ex?", tuple_ex is tuple_cpy)
list_cpy = list(list_ex)
print("Is list_copy same as list_ex?",list_ex is list_cpy)

Is tuple_copy same as tuple_ex? True
Is list_copy same as list_ex? False


## Practice

In [37]:
#Make a list of 100 random integers between 0 and 100
import random
mylist=[]
for i in range(100):
    mylist.append(random.randint(0,100))
#print(mylist)

mylist2=[random.randint(0,100) for i in range(100) ]
print(mylist2)


#Add [44, 34, 66] to the list
mylist2=mylist2+[44, 34, 66]
print(mylist2)
#mylist2.append([44, 34, 66])  #doesn't behave like we wanted: it would add the whole list as one nested element
mylist2.append(8)
#print(mylist2)


#Insert a 75 at index 8
mylist2.insert(8, 75)
#print(mylist2)


#Double the elements in the list (concatenate it onto itself)
#Two equivalent ways; either line alone is enough
mylist3=mylist2*2
mylist3=mylist2+mylist2
#print(mylist3)

#count the number of elements greater than 50
num= len([x for x in mylist3 if x>50])
print(num)



#Print out the average of the list
avg=sum(mylist3)/len(mylist3)
print(avg)



#What is the difference between list object methods and functions that take in lists?  Use the sum function and list count method as an example.



[67, 49, 85, 30, 70, 65, 34, 68, 19, 3, 69, 53, 85, 44, 76, 75, 65, 9, 24, 30, 0, 85, 60, 28, 76, 79, 53, 52, 64, 79, 71, 23, 51, 85, 45, 23, 29, 24, 92, 84, 69, 70, 75, 82, 4, 1, 4, 48, 88, 87, 35, 44, 4, 36, 93, 59, 44, 55, 8, 63, 2, 92, 82, 88, 74, 64, 86, 91, 70, 76, 92, 53, 21, 65, 0, 74, 69, 80, 40, 36, 69, 71, 98, 29, 8, 73, 51, 70, 12, 86, 52, 21, 74, 13, 30, 48, 5, 17, 81, 75]
[67, 49, 85, 30, 70, 65, 34, 68, 19, 3, 69, 53, 85, 44, 76, 75, 65, 9, 24, 30, 0, 85, 60, 28, 76, 79, 53, 52, 64, 79, 71, 23, 51, 85, 45, 23, 29, 24, 92, 84, 69, 70, 75, 82, 4, 1, 4, 48, 88, 87, 35, 44, 4, 36, 93, 59, 44, 55, 8, 63, 2, 92, 82, 88, 74, 64, 86, 91, 70, 76, 92, 53, 21, 65, 0, 74, 69, 80, 40, 36, 69, 71, 98, 29, 8, 73, 51, 70, 12, 86, 52, 21, 74, 13, 30, 48, 5, 17, 81, 75, 44, 34, 66]
124
52.923809523809524
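
As one possible starting point for the question above, the sketch below puts the two call styles side by side. The list `grades` is made up for the illustration.

```python
grades = [90, 75, 90, 60]

# sum is a built-in function: the list is passed in as an argument,
# and any iterable of numbers would work
total = sum(grades)
tuple_total = sum((90, 75))

# count is a method: it belongs to the list object and is called with dot syntax
nineties = grades.count(90)

print(total)        # 315
print(tuple_total)  # 165
print(nineties)     # 2
```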


# **Dictionaries**

A dictionary in Python consists of **key-value pairs**, where the keys and values are Python objects. The keys must be hashable, which in practice means immutable types (e.g., strings, integers, or tuples of immutable items), while the values can be of any type. For example, a list can be a value but cannot serve as a key, because lists are mutable.

A dictionary can be defined using curly braces `{}` or the `dict()` function, with colons `:` separating keys and values and commas `,` separating key-value pairs:


In [67]:
student_grades = {
    "Sophie": 85,
    "GiGi": 92,
    "Lucy": 78,
    "Tilly": 95,
    "Buddy": 88
}
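
The cell above uses curly braces. As a hedged sketch of the `dict()` form and of the key restriction described earlier, the example below builds the same small dictionary two ways and then tries a tuple key and a list key. The names `grades_a`, `grades_b`, and `points` are invented here.

```python
# Keyword arguments: keys become strings and must be valid identifiers
grades_a = dict(Sophie=85, GiGi=92)

# A sequence of (key, value) pairs
grades_b = dict([("Sophie", 85), ("GiGi", 92)])

print(grades_a == grades_b)  # True

# A tuple is hashable, so it can be a key
points = {(0, 0): "origin"}
print(points[(0, 0)])        # origin

# A list is not hashable, so using one as a key raises TypeError
try:
    bad = {[0, 0]: "origin"}
    list_key_allowed = True
except TypeError:
    list_key_allowed = False

print(list_key_allowed)      # False
```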



You can retrieve a value from the dictionary using its key.  

In [41]:
print(student_grades["Buddy"])

88
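
Square-bracket lookup raises a `KeyError` when the key is absent. The sketch below shows the `get` method as a gentler alternative, using a small made-up dictionary `grades` and a key, `"Rex"`, that is deliberately missing.

```python
grades = {"Sophie": 85, "Buddy": 88}

present = grades.get("Buddy")     # 88
absent = grades.get("Rex")        # None, no error is raised
fallback = grades.get("Rex", 0)   # 0, the supplied default

# Square brackets raise KeyError for a missing key
try:
    grades["Rex"]
    raised = False
except KeyError:
    raised = True

print(present, absent, fallback, raised)  # 88 None 0 True
```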


In [43]:
# Linking a string to multiple values using a tuple
student_info = {
    'Keyshawn': ('Math', 85, 'A'),
    'Jing': ('Science', 92, 'A'),
    'Darrel': ('History', 78, 'B'),
}

# Accessing the linked values

print(student_info['Jing'])
print("Keyshawn's subject:", student_info['Keyshawn'][0]) 
print("Keyshawn's grade:", student_info['Keyshawn'][1])  
print("Keyshawn's letter grade:", student_info['Keyshawn'][2]) 

('Science', 92, 'A')
Keyshawn's subject: Math
Keyshawn's grade: 85
Keyshawn's letter grade: A


### Adding and Removing Elements in a Dictionary

* ***Adding Elements:*** New elements can be added to a dictionary by assigning a value to a new key.

* ***Removing Elements:*** You can remove elements from a dictionary using either the `del` statement or the `pop()` method.

In [69]:
student_grades["Ollie"]=99
student_grades["Test Student"]="nothing"   #notice that this works even though the value is a string while the other values are integers
print(student_grades)


{'Sophie': 85, 'GiGi': 92, 'Lucy': 78, 'Tilly': 95, 'Buddy': 88, 'Ollie': 99, 'Test Student': 'nothing'}


In [71]:
del student_grades['Test Student']


In [73]:
print(student_grades)

{'Sophie': 85, 'GiGi': 92, 'Lucy': 78, 'Tilly': 95, 'Buddy': 88, 'Ollie': 99}


In [75]:
# Remove the "Buddy" entry and store its value in a variable
dog= student_grades.pop("Buddy")
print(student_grades)
#note: running this cell again raises a KeyError because "Buddy" is no longer there

{'Sophie': 85, 'GiGi': 92, 'Lucy': 78, 'Tilly': 95, 'Ollie': 99}


In [77]:
print(dog)

88
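
If a cell like the one above might be run more than once, `pop` accepts an optional default that is returned when the key is missing. This is a small sketch on an invented dictionary `grades`.

```python
grades = {"Sophie": 85, "Buddy": 88}

first = grades.pop("Buddy")          # removes the entry and returns 88
second = grades.pop("Buddy", None)   # key is gone, so the default is returned

print(first, second)  # 88 None
print(grades)         # {'Sophie': 85}
```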


In [79]:
# You can update values using update

student_grades.update({"Tilly":100})   #careful with the syntax
print(student_grades)

{'Sophie': 85, 'GiGi': 92, 'Lucy': 78, 'Tilly': 100, 'Ollie': 99}


In [56]:
###  Iterating over a dictionary
for key, value in student_grades.items():
    print(f"{key} had a final grade of {value}")


#Checking if a key is in the dictionary
if "Sophie" in student_grades:
    print("Sophie is there")

Sophie had a final grade of 85
GiGi had a final grade of 92
Lucy had a final grade of 78
Tilly had a final grade of 100
Ollie had a final grade of 99
Sophie is there


### Practice Exercise

We'll go back to our GDP example! The first cell below creates a dictionary representing the GDP per capita of the USA for most years from 1960 to 2021.

In the cell after it, do the following:
   1.  Print the GDP per capita in 2015.
   2.  The year 2014 is missing from the dataset. Add it, using the average of 2013 and 2015 as its value.
   3.  There are other missing years, but no consecutive missing years. Find the missing years and use the average of the surrounding years as the values for those years.

In [83]:
dict_GDP = {'1960':3007,'1961':3067,'1962':3244,'1963':3375,'1964':3574,'1965':3828,'1966':4146,'1967':4336,'1968':4696,'1970':5234,'1971':5609,'1972':6094,'1973':6726,'1974':7226,'1975':7801,'1976':8592,'1978':10565,'1979':11674, '1980':12575,'1981':13976,'1982':14434,'1983':15544,'1984':17121,'1985':18237,  '1986':19071,'1987':20039,'1988':21417,'1989':22857,'1990':23889,'1991':24342,  '1992':25419,'1993':26387,'1994':27695,'1995':28691,'1996':29968,'1997':31459,  '1998':32854,'2000':36330,'2001':37134,'2002':37998,'2003':39490,'2004':41725,  '2005':44123,'2006':46302,'2007':48050,'2008':48570,'2009':47195,'2010':48651,  '2011':50066,'2012':51784,'2013':53291,'2015':56763,'2016':57867,'2017':59915,'2018':62805, '2019':65095,'2020':63028,'2021':69288}

In [103]:
#notice that the years are strings...
#1
print(dict_GDP['2015'])

#2
val=int((dict_GDP['2013']+dict_GDP['2015'])/2)
print(val)
dict_GDP['2014']=val
print(dict_GDP['2014'])  #make sure it's there
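#3
#Each year is printed once, and printed a second time if it is in the dictionary,
#so a year that appears only once in the output is missing.
#Note: range(1960, 2021) stops at 2020, so 2021 is not checked.
for i in range(1960, 2021):
    print(str(i))
    if (str(i)) in dict_GDP:
        print(i)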



56763
55027
55027
1960
1960
1961
1961
1962
1962
1963
1963
1964
1964
1965
1965
1966
1966
1967
1967
1968
1968
1969
1970
1970
1971
1971
1972
1972
1973
1973
1974
1974
1975
1975
1976
1976
1977
1978
1978
1979
1979
1980
1980
1981
1981
1982
1982
1983
1983
1984
1984
1985
1985
1986
1986
1987
1987
1988
1988
1989
1989
1990
1990
1991
1991
1992
1992
1993
1993
1994
1994
1995
1995
1996
1996
1997
1997
1998
1998
1999
2000
2000
2001
2001
2002
2002
2003
2003
2004
2004
2005
2005
2006
2006
2007
2007
2008
2008
2009
2009
2010
2010
2011
2011
2012
2012
2013
2013
2014
2014
2015
2015
2016
2016
2017
2017
2018
2018
2019
2019
2020
2020
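
The loop above only reveals the missing years by eye. A minimal sketch of one way to finish step 3 follows. It runs on a short excerpt of the data (the variable `gdp` is a stand-in for `dict_GDP`) and relies on the stated assumption that no two missing years are consecutive.

```python
# Short excerpt of the GDP data; 1969 is missing
gdp = {'1967': 4336, '1968': 4696, '1970': 5234, '1971': 5609}

# Collect the missing years first, so the dictionary is not
# modified while it is being scanned
years = [int(y) for y in gdp]
missing = [y for y in range(min(years), max(years) + 1) if str(y) not in gdp]

# Fill each gap with the integer average of its two neighbours
for y in missing:
    gdp[str(y)] = (gdp[str(y - 1)] + gdp[str(y + 1)]) // 2

print(missing)      # [1969]
print(gdp['1969'])  # 4965
```

Filled years are added at the end of the dictionary, so sort the keys if the years need to be displayed in order.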


In [74]:
#Putting it all together...

#What would be the best way to model a deck of cards?   Maybe a list of tuples?

deck = [(i, c) for c in ['spades', 'clubs', 'hearts', 'diamonds']  for i in range(2,15)]
print(deck)


# You could also make it a list of dictionaries, so the "suit" and "value" can be looked up by name at any time
deck2 = [{"value":i, "suit":c} for c in ['spades', 'clubs', 'hearts', 'diamonds']  for i in range(2,15)]
print(deck2)

#Discuss advantages of both methods


[(2, 'spades'), (3, 'spades'), (4, 'spades'), (5, 'spades'), (6, 'spades'), (7, 'spades'), (8, 'spades'), (9, 'spades'), (10, 'spades'), (11, 'spades'), (12, 'spades'), (13, 'spades'), (14, 'spades'), (2, 'clubs'), (3, 'clubs'), (4, 'clubs'), (5, 'clubs'), (6, 'clubs'), (7, 'clubs'), (8, 'clubs'), (9, 'clubs'), (10, 'clubs'), (11, 'clubs'), (12, 'clubs'), (13, 'clubs'), (14, 'clubs'), (2, 'hearts'), (3, 'hearts'), (4, 'hearts'), (5, 'hearts'), (6, 'hearts'), (7, 'hearts'), (8, 'hearts'), (9, 'hearts'), (10, 'hearts'), (11, 'hearts'), (12, 'hearts'), (13, 'hearts'), (14, 'hearts'), (2, 'diamonds'), (3, 'diamonds'), (4, 'diamonds'), (5, 'diamonds'), (6, 'diamonds'), (7, 'diamonds'), (8, 'diamonds'), (9, 'diamonds'), (10, 'diamonds'), (11, 'diamonds'), (12, 'diamonds'), (13, 'diamonds'), (14, 'diamonds')]
[{'value': 2, 'suit': 'spades'}, {'value': 3, 'suit': 'spades'}, {'value': 4, 'suit': 'spades'}, {'value': 5, 'suit': 'spades'}, {'value': 6, 'suit': 'spades'}, {'value': 7, 'suit': 'spa
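
As a starting point for the discussion, the sketch below rebuilds both decks and reads one card from each, to suggest how access differs: by position for tuples and by name for dictionaries. The `suits` and `hearts` variables are introduced only for this illustration.

```python
suits = ['spades', 'clubs', 'hearts', 'diamonds']
deck = [(i, c) for c in suits for i in range(2, 15)]
deck2 = [{"value": i, "suit": c} for c in suits for i in range(2, 15)]

# Tuples: compact, unpacked by position
value, suit = deck[0]
print(value, suit)            # 2 spades

# Dictionaries: more verbose, but fields are read by name
print(deck2[0]["suit"])       # spades

# Filtering works on either form
hearts = [card for card in deck if card[1] == 'hearts']
print(len(deck), len(deck2), len(hearts))  # 52 52 13
```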

### Practice Exercise
To represent Chick-fil-A's nutritional information in a Python dictionary, you can structure it with menu items as keys and their corresponding nutritional details as values (also stored as a dictionary). Here's an example:

In [102]:
chickfila_nutrition = {
    'Chick-fil-A Filet': {
        'Calories': 260,
        'Fat': 12,
        'Carbs': 13,
        'Protein': 25
    },
    'Chicken Sandwich': {
        'Calories': 440,
        'Fat': 19,
        'Carbs': 40,
        'Protein': 28
    },
    'Grilled Chicken Sandwich': {
        'Calories': 320,
        'Fat': 6,
        'Carbs': 39,
        'Protein': 29
    },
    'Chicken Nuggets (8-count)': {
        'Calories': 250,
        'Fat': 12,
        'Carbs': 11,
        'Protein': 14
    },
    'Grilled Chicken Nuggets (8-count)': {
        'Calories': 140,
        'Fat': 3,
        'Carbs': 2,
        'Protein': 26
    },
    'Waffle Potato Fries (Medium)': {
        'Calories': 400,
        'Fat': 24,
        'Carbs': 45,
        'Protein': 5
    },
    'Side Salad': {
        'Calories': 80,
        'Fat': 5,
        'Carbs': 7,
        'Protein': 2
    },
    'Icedream Cone': {
        'Calories': 200,
        'Fat': 7,
        'Carbs': 31,
        'Protein': 4
    }
}

print(type(chickfila_nutrition))


<class 'dict'>


In [140]:
#Print out the number of calories in a 'Side Salad'
print(chickfila_nutrition['Side Salad'])

print(chickfila_nutrition['Side Salad']['Calories'])


highest_protein=0
highest_item=""
#Find and print out the food with the highest protein.
for key, val in chickfila_nutrition.items():
    #print(val['Protein'])  #just print out the protein

    if (val['Protein']>highest_protein):
        highest_protein=val['Protein']
        highest_item=key

print(highest_protein,highest_item)


#you could also create a new dictionary that stores only the item and the protein value, then grab the max of the values

protein_only={}
for key, val in chickfila_nutrition.items():
    protein_only[key]=val['Protein']
      
print(protein_only)
print("Using max on the values:", max(protein_only.values()))
print([key for key, val in chickfila_nutrition.items() if val['Protein']==max(protein_only.values())])

#the second version is probably better because it will identify if there are multiple items with the maximum value
#However, I really should be careful when I'm calling a function like max more than once.  It would probably be better to
#store the value in an intermediate variable so that the max function is only executed once, making my code more efficient.







{'Calories': 80, 'Fat': 5, 'Carbs': 7, 'Protein': 2}
80
29 Grilled Chicken Sandwich
{'Chick-fil-A Filet': 25, 'Chicken Sandwich': 28, 'Grilled Chicken Sandwich': 29, 'Chicken Nuggets (8-count)': 14, 'Grilled Chicken Nuggets (8-count)': 26, 'Waffle Potato Fries (Medium)': 5, 'Side Salad': 2, 'Icedream Cone': 4}
Using max on the values: 29
['Grilled Chicken Sandwich']


***Practice*** Which menu items have a fat content of less than or equal to 10?
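
One possible approach, sketched on a three-item excerpt of the nutrition data so the cell runs on its own, is a list comprehension over `items()`. The names `menu` and `low_fat` are assumptions made for this example; with the full `chickfila_nutrition` dictionary the same comprehension should apply.

```python
# Excerpt of the nutrition data from above
menu = {
    'Chicken Sandwich': {'Calories': 440, 'Fat': 19, 'Carbs': 40, 'Protein': 28},
    'Grilled Chicken Sandwich': {'Calories': 320, 'Fat': 6, 'Carbs': 39, 'Protein': 29},
    'Side Salad': {'Calories': 80, 'Fat': 5, 'Carbs': 7, 'Protein': 2},
}

# Keep the item name whenever its fat value is 10 or less
low_fat = [item for item, facts in menu.items() if facts['Fat'] <= 10]

print(low_fat)  # ['Grilled Chicken Sandwich', 'Side Salad']
```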