
# Lists and tuples

Often we need to store several individual items of data together so that they can be processed as a group. This might be because all the data refers to one person (e.g. name, age, gender) or because we have a collection of similar values (e.g. all the items that should be displayed in a drop-down list, such as every year from this year back to 100 years ago, so that someone can select their year of birth).

Python has a range of data structures available including:
*   lists  
*   tuples  
*   dictionaries  
*   sets

This worksheet looks at lists and tuples.

## List
A list is an ordered collection of related, individual data objects that are indexed and can be processed as a whole, as subsets or as individual items.  Lists are stored, essentially, as contiguous items in memory so that access by index is as quick as possible.  However, they are mutable (they can be changed after they are created and stored), so the storage mechanism needs extra functionality to deal with changing list sizes.

## Tuple
A tuple is essentially the same as a list, but it is immutable: once it has been created it can't be changed.  It is stored in memory as contiguous items, with the size required being fixed right from the start.  This can make a tuple slightly cheaper to create and store than a list holding the same items.
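
The difference between mutable and immutable can be seen in a short sketch. The names `colours`, `days` and `tuple_changed` are made up for this illustration and are not used elsewhere in the worksheet.

```python
# A list can be changed in place after it has been created.
colours = ["red", "green", "blue"]
colours[0] = "yellow"        # replace the first item
colours.append("purple")     # add an item to the end
print(colours)               # ['yellow', 'green', 'blue', 'purple']

# A tuple cannot be changed: assigning to an item raises a TypeError.
days = ("Mon", "Tue", "Wed")
try:
    days[0] = "Sun"
    tuple_changed = True
except TypeError as error:
    tuple_changed = False
    print("Tuples are immutable:", error)

print(days)                  # ('Mon', 'Tue', 'Wed')
```

Reading items works the same way for both: `colours[0]` and `days[0]` each return the first item.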

The code below will create two lists and a tuple.
*   the first list contains 1000 random numbers between 1 and 100
*   the second list is of random length (up to 5000) and each item is one of the 9 characteristics that are protected under the Equality Act in the UK.
*   the tuple contains the 9 protected characteristics

Before you start the exercises, run the code below.  It will generate the lists and tuple so that you can use them in the exercises.  If you need to recreate the lists (because you have changed them and need to work on the originals), just run this cell again.

***Note:***  *a list variable contains a reference to the list in memory, rather than storing the list itself.  This means that if you assign the list to another variable (to make a copy), it will only copy across the reference.  If you change the list through the second variable, you change the original list.*

*If you need an independent copy of the list, one way is to use a loop to create a new list and copy all items across (Exercise 4 practises this).  The list method* `copy()` *does the same job.*
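
The note above can be checked with a small sketch. The variable names here (`original`, `alias`, `loop_copy`, `method_copy`) are illustrative only. Besides the loop, Python lists also have a built-in `copy()` method, shown for comparison.

```python
original = [1, 2, 3]

# Assignment copies only the reference: both names point at the same list.
alias = original
alias.append(4)
print(original)               # [1, 2, 3, 4]
print(alias is original)      # True

# Copying item by item with a loop creates a separate list.
loop_copy = []
for item in original:
    loop_copy.append(item)

# The list method copy() also creates a separate list.
method_copy = original.copy()

# Changing the copies leaves the original untouched.
loop_copy.append(99)
method_copy.append(100)
print(original)               # [1, 2, 3, 4]
print(loop_copy is original)  # False
```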

In [53]:
from random import randint, choice

def get_num_list():
  num_list = [randint(1,100) for n in range(1000)]
  return num_list

def get_protected_characteristics():
  characteristics_tuple = ('age','disability','gender reassignment','marriage and civil partnership','pregnancy and maternity','race','religion or belief','sex','sexual orientation')
  return characteristics_tuple

def get_protected_characteristic_list(protected_characteristics):
  char_list = [choice(protected_characteristics) for ch in range(randint(1,5000))]
  return char_list

nums = get_num_list()
protected_characteristics = get_protected_characteristics()
characteristics = get_protected_characteristic_list(protected_characteristics)

## The exercises below will use the lists:  
*   **nums** (a list of 1000 random numbers, each number is between 1 and 100)
*   **characteristics** (a list of between 1 and 5000 randomly chosen protected characteristics)

and the tuple:
*  **protected_characteristics** (a tuple of the 9 protected characteristics identified in the Equality Act)

## You can run the cell above any number of times to generate new lists.

---
### Exercise 1 - list head, tail and shape

Write a function, **describe_list()** which will:
*  print the length of the list `nums`
*  print the first 10 items in `nums`  
*  print the last 5 items in `nums`
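
Slicing is the usual tool for taking the head or tail of a list. The sketch below uses a small made-up list, `letters`, rather than `nums`, so the results are easy to check by eye.

```python
letters = ["a", "b", "c", "d", "e", "f", "g"]

print(len(letters))      # 7

# letters[:3] means "from the start up to, but not including, index 3".
first_three = letters[:3]
print(first_three)       # ['a', 'b', 'c']

# A negative start counts back from the end of the list.
last_two = letters[-2:]
print(last_two)          # ['f', 'g']

# Asking for more items than exist is safe: the slice just stops at the end.
everything = letters[:100]
print(everything)        # ['a', 'b', 'c', 'd', 'e', 'f', 'g']
```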

In [83]:
def describe_list(nums):

  # print the length of the list nums
  length_of_the_list = len(nums)
  print(length_of_the_list)

  # print the first 10 items in nums
  first_10_items = nums[:10]
  print(first_10_items)

  # print the last 5 items in nums
  last_5_items = nums[-5:]
  print(last_5_items)

describe_list(nums)

1000
[53, 1, 15, 100, 74, 97, 81, 15, 70, 1]
[36, 63, 86, 21, 6]


---
### Exercise 2 - show tuple items

Write a function which will:
*   use a loop to print the list of protected characteristics from the `protected_characteristics` tuple.


In [82]:
def tuple_items():

  # use a loop to print each protected characteristic from the protected_characteristics tuple
  for characteristic in protected_characteristics:
    print(characteristic, ",", end=" ")
  print()

tuple_items()

age , disability , gender reassignment , marriage and civil partnership , pregnancy and maternity , race , religion or belief , sex , sexual orientation , 
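
Printing inside a loop with `end=" "` leaves a trailing comma after the last item, as in the output above. A possible alternative, sketched here with a made-up tuple called `seasons`, is to number the items with `enumerate()` or to build one string with `str.join()`.

```python
seasons = ("spring", "summer", "autumn", "winter")

# enumerate() gives each item together with its position (starting at 1 here).
for position, season in enumerate(seasons, start=1):
    print(position, season)

# join() puts the separator only BETWEEN items, so there is no trailing comma.
one_line = ", ".join(seasons)
print(one_line)          # spring, summer, autumn, winter
```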


---
### Exercise 3 - list a random subset

Write a function which will:
*  calculate the position of the middle item in the `characteristics` list   
(*Hint: use len() to help with this*)
*  calculate the position of the item that is 5 places before the middle item
*  calculate the position of the item that is 5 places after the middle item
*  print the part of the list that includes the items from 5 places before to 5 places after.  

Expected output:  
Your list will include 11 items.
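
The index arithmetic can be checked on a small list where the answer is obvious. This is a sketch using a made-up list, `items`, of 21 consecutive numbers. Note that the end of a slice is not included, which is why 1 is added to the end position.

```python
items = list(range(100, 121))    # 21 items: 100, 101, ..., 120

middle = len(items) // 2         # integer division gives a whole-number index
start = middle - 5
end = middle + 5
print(middle, start, end)        # 10 5 15

# items[start:end] would stop one short, so use end + 1 to include the item at end.
window = items[start:end + 1]
print(window)                    # [105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115]
print(len(window))               # 11
```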

In [81]:
def random_subset():

  # 1- calculate the position of the middle item in the characteristics list
  list_length = len(characteristics)
  middle_item = list_length // 2
  print(middle_item)

  # 2- calculate the position of the item that is 5 places before the middle item
  before_middle = middle_item - 5
  print(before_middle)

  # 3- calculate the position of the item that is 5 places after the middle item
  after_middle = middle_item + 5
  print(after_middle)

  # 4- print the part of the list from 5 places before to 5 places after the middle item
  #    (after_middle + 1 because the end position of a slice is not included)
  new_list = characteristics[before_middle:after_middle + 1]
  print(new_list)

random_subset()


2477
2472
2482
['sex', 'religion or belief', 'race', 'race', 'gender reassignment', 'sexual orientation', 'age', 'religion or belief', 'religion or belief', 'sexual orientation', 'race']


---
### Exercise 4 - create a copy

Write a function which will use a for loop to create a copy of the `nums` list:

*   create a new, empty, list called **new_nums**  (*Hint: an empty list is [ ]*)
*   use a for loop which uses the following syntax:  `for num in nums:`
*   each time round the loop append `num` to `new_nums`  ( *`new_nums.append(num)`*)
*   print the first 10 items of `new_nums`
*   print the first 10 items of `nums`
*   print the length of both lists

In [80]:
def create_a_copy():


 # creating a new, empty, list called new_nums
 new_nums=[]
 print(new_nums)


 # use a for loop which uses the following syntax: for num in nums:
 for num in nums:

  # each time round the loop append num to new_nums
  new_nums.append(num)


 # print the first 10 items of new_nums
 print(new_nums[:10])

 # print the first 10 items of nums
 print(nums[:10])


 #print the length of both lists

 new_nums_len=len(new_nums)
 print(new_nums_len)

 nums_len=len(nums)
 print(nums_len)

create_a_copy()


[]
[53, 1, 15, 100, 74, 97, 81, 15, 70, 1]
[53, 1, 15, 100, 74, 97, 81, 15, 70, 1]
1000
1000





---
### Exercise 5 - count the occurrence of age in characteristics

Write a function which will use the list method:

`num_items = list_name.count(item)`

to count the number of occurrences of 'age' in the `characteristics` list.  Print the result.
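
As a quick illustration of `count()`, here is a sketch on a short made-up list called `votes`. It also shows that counting an item that is not in the list returns 0 rather than causing an error.

```python
votes = ["yes", "no", "yes", "maybe", "yes"]

yes_votes = votes.count("yes")
print(yes_votes)             # 3

# An item that never appears has a count of 0.
abstain_votes = votes.count("abstain")
print(abstain_votes)         # 0
```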

In [78]:
def occurrence_of_age():

# to count the number of occurrences of 'age' in the characteristics list

 num_items=characteristics.count('age')
 print(num_items)

occurrence_of_age()

571


---
### Exercise 6 - sort the nums list

Write a function which will:
*   call the function `get_num_list()` and store the result in a new list called **sort_nums**
*   print the first, and last, 20 items in the `sort_nums` list
*   use the `list_name.sort()` method to sort the `sort_nums` list into ascending order
*   print the first, and last, 20 items again  
*   use the `list_name.sort()` method again to sort the `sort_nums` list into descending order
*   print the first, and last, 20 items again
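
Two details of `sort()` are easy to trip over: it changes the list in place and returns `None`, and descending order needs the `reverse=True` argument. The sketch below uses made-up lists (`scores`, `unsorted`) and also shows the built-in `sorted()`, which returns a new list and leaves the original alone.

```python
scores = [42, 7, 99, 15]

# sort() rearranges the list itself and returns None.
result = scores.sort()
print(result)                # None
print(scores)                # [7, 15, 42, 99]

# reverse=True sorts into descending order.
scores.sort(reverse=True)
print(scores)                # [99, 42, 15, 7]

# sorted() builds a new sorted list and does not change the original.
unsorted = [3, 1, 2]
ordered = sorted(unsorted)
print(unsorted)              # [3, 1, 2]
print(ordered)               # [1, 2, 3]
```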

In [48]:
def sort_num_list():

  # 1- call the function get_num_list() and store the result in a new list called sort_nums
  #    (the function has its own name so that it does not replace get_num_list())
  sort_nums = get_num_list()

  # print the first, and last, 20 items in the sort_nums list
  print(sort_nums[:20])
  print(sort_nums[-20:])

  # 2- sort the sort_nums list into ascending order
  sort_nums.sort()

  # print the first, and last, 20 items again
  print(sort_nums[:20])
  print(sort_nums[-20:])

  # 3- use the sort() method again to sort the sort_nums list into descending order
  sort_nums.sort(reverse=True)

  # print the first, and last, 20 items again
  print(sort_nums[:20])
  print(sort_nums[-20:])

sort_num_list()

[51, 83, 9, 84, 29, 75, 49, 98, 80, 77, 53, 59, 3, 30, 70, 83, 63, 23, 68, 1]
[39, 30, 92, 30, 50, 46, 13, 26, 5, 98, 24, 62, 97, 95, 89, 97, 34, 31, 82, 58]
[1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3]
[99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 100, 100, 100, 100, 100]
[100, 100, 100, 100, 100, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99]
[3, 3, 3, 3, 3, 2, 2, 2, 2, 2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1]


---
### Exercise 7 - get statistics (max(), min(), sum() )

Write a function which will:
*   print the maximum and minimum numbers in the `nums` list  
*   print the sum of the `nums` list
*   calculate and print the average of the `nums` list (using `len()` to help)
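
The built-in functions `max()`, `min()`, `sum()` and `len()` work on any list of numbers. Here is a sketch on a small made-up list, `temperatures`, where the results can be checked by hand. It is best not to call your own variables `max`, `min` or `sum`, because that hides the built-in functions.

```python
temperatures = [12, 18, 9, 21, 15]

highest = max(temperatures)
lowest = min(temperatures)
total = sum(temperatures)

# The average is the sum divided by the number of items.
average = total / len(temperatures)

print(highest)    # 21
print(lowest)     # 9
print(total)      # 75
print(average)    # 15.0
```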

In [73]:
def get_statistics(nums):

  # 1- print the maximum number in the nums list
  max_num = max(nums)
  print(max_num)

  # 2- print the minimum number in the nums list
  min_num = min(nums)
  print(min_num)

  # 3- print the sum of the nums list
  list_sum = sum(nums)
  print(list_sum)

  # 4- calculate and print the average of the nums list
  average = list_sum / len(nums)
  print(average)

get_statistics(nums)

100
1
50194
50.194


---
### Exercise 8 - percentage difference

Write a function which will:
*   generate a new list called **ex8_nums** using `get_num_list()`
*   calculate and print the percentage difference between the first number in each list (as a percentage of the number in the nums list) (Hint:  find the difference between the two numbers, divide the difference by the number in `nums` and multiply by 100)
*   calculate and print the percentage difference between the last numbers in each list in the same way
*   calculate and print the percentage difference between the middle numbers in each list in the same way.
*   calculate and print the percentage difference between the sums of each list in the same way
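
The same calculation is needed four times, so one option is to put it in a small helper function. The sketch below uses a hypothetical helper called `percent_change`, which is not part of the worksheet, and follows the hint above: difference, divided by the original number, multiplied by 100.

```python
def percent_change(original, new):
    """Return the difference between new and original as a percentage of original."""
    difference = new - original
    return difference / original * 100

increase = percent_change(50, 75)
decrease = percent_change(80, 60)

print(increase)        # 50.0
print(decrease)        # -25.0

# abs() removes the sign if only the size of the difference matters.
print(abs(decrease))   # 25.0
```

A negative result means the second number is smaller than the first. The calculation fails with a `ZeroDivisionError` if the original number is 0, which cannot happen with `nums` because its values start at 1.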

In [54]:
def percentage_difference():

  ex8_nums = get_num_list()

  # percentage difference between the first number in each list
  num1 = nums[0]                  # index 0 is the first item
  num2 = ex8_nums[0]
  difference = num2 - num1        # wrap this in abs() to always get a positive difference
  percentage = (difference / num1) * 100

  print(num1)
  print(num2)
  print(difference)
  print(f"{percentage:.2f}%")

  # percentage difference between the last number in each list
  num1 = nums[-1]                 # index -1 is the last item
  num2 = ex8_nums[-1]
  difference = num2 - num1
  percentage = (difference / num1) * 100

  print(num1)
  print(num2)
  print(difference)
  print(f"{percentage:.2f}%")

  # percentage difference between the middle number in each list
  middle_index = len(nums) // 2
  num1 = nums[middle_index]
  num2 = ex8_nums[middle_index]
  difference = num2 - num1
  percentage = (difference / num1) * 100

  print(num1)
  print(num2)
  print(difference)
  print(f"{percentage:.2f}%")

  # percentage difference between the sum of each list
  sum1 = sum(nums)                # sum() adds up all the items in a list
  sum2 = sum(ex8_nums)
  difference = sum2 - sum1
  percentage = (difference / sum1) * 100    # divide by sum1, not by num1

  print(sum1)
  print(sum2)
  print(difference)
  print(f"{percentage:.2f}%")

percentage_difference()

53
75
22
41.51%
6
20
14
233.33%
12
52
40
333.33%
50194
52458
2264
4.51%


---
### Exercise 9 - characteristic counts

Write a function which will:
*  iterate through the `protected_characteristics` tuple and for each **characteristic**:
    *   count the number of occurrences of that `characteristic` in the `characteristics` list
    *   print the `characteristic` and the **count**  

Example expected output:

age 100  
disability 120  
gender reassignment 120  
marriage and civil partnership 111  
pregnancy and maternity 103  
race 106  
religion or belief 95  
sex 110  
sexual orientation 113  

Extra learning:  you can read [here](https://thispointer.com/python-how-to-pad-strings-with-zero-space-or-some-other-character/) how to justify the printed characteristic so that the output is organised into two columns as shown below:  
![tabulated output](https://drive.google.com/uc?id=1CCXfX6K5ZeDefnq7vUsqxCDmqvcfY8Mz)
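
One way to get two aligned columns is the string method `ljust()`, which pads a string with spaces on the right up to a given width. This is only a sketch: the data (`fruit_counts`) is made up, and the column width is worked out from the longest name plus two spaces.

```python
fruit_counts = (("apple", 12), ("kiwi", 7), ("blackcurrant", 103))

# Make the first column wide enough for the longest name, plus a 2-space gap.
width = max(len(name) for name, _ in fruit_counts) + 2

rows = []
for name, count in fruit_counts:
    # ljust(width) pads the name with spaces so every count starts in the same column.
    rows.append(name.ljust(width) + str(count))

for row in rows:
    print(row)
```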





In [72]:


def characteristic_counts():

 for characteristic in protected_characteristics:
    # to count the number of occurrences of that characteristic in the characteristics list

    count=characteristics.count(characteristic)

    # to print the protected_characteristic and the count
    print(characteristic, count)



characteristic_counts()

age 571
disability 536
gender reassignment 538
marriage and civil partnership 524
pregnancy and maternity 519
race 560
religion or belief 544
sex 587
sexual orientation 575


---
### Exercise 10 - characteristics statistics

Assume that the `characteristics` list has been taken from a study of cases that have been taken to court in relation to the Equality Act.  

Write a function which will:

*   find the most common characteristic resulting in court action, from this population
*   print this in a message, e.g. The characteristic with the highest number of court cases is:  *characteristic*
*   print the list of `protected_characteristics`, on one line if possible - see [here](https://www.geeksforgeeks.org/g-fact-25-print-single-multiple-variable-python/)
*   ask the user to enter a characteristic that they would like to see statistics on and use a while loop to continue until the user has entered a valid characteristic
*   print the characteristic, its frequency and the percentage that this frequency is of the whole population.
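
The steps above can be sketched on a smaller problem. Everything in this example is made up for illustration: the data (`valid_options`, `responses`) and the helper `ask()`, which stands in for `input()` by returning pre-set answers so that the sketch runs without a keyboard. The key point is that the question is asked again inside the while loop. If it is only asked outside the loop, an invalid answer makes the loop run forever.

```python
valid_options = ("red", "green", "blue")
responses = ["green", "red", "green", "blue", "green"]

# Most common option: compare the options by how often each appears in responses.
most_common = max(valid_options, key=responses.count)
print("The most common colour is:", most_common)

# Simulated keyboard input: two invalid answers, then a valid one.
typed_answers = iter(["purple", "Green", "green"])

def ask(prompt):
    """Stand-in for input(): returns the next simulated answer."""
    return next(typed_answers)

attempts = 1
answer = ask("Enter a colour: ")
while answer not in valid_options:
    print("Invalid colour:", answer)
    answer = ask("Enter a colour: ")    # ask again INSIDE the loop
    attempts += 1

frequency = responses.count(answer)
percentage = frequency / len(responses) * 100
print(f"{answer} appears {frequency} times, which is {percentage:.1f}% of all responses.")
```

The check is case-sensitive, so `"Green"` is rejected. Converting the answer with `lower()` before checking would be one way to accept it.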

In [71]:
def most_common_characteristics(characteristics):

  # find the most common characteristic resulting in court action, from this population
  most_common = max(characteristics, key=characteristics.count)
  print("The characteristic with the highest number of court cases is:", most_common)

  # print the tuple of protected characteristics on one line
  print(protected_characteristics)

  # ask for user input
  user_input = input("Enter a characteristic: ")

  # use a while loop to keep asking until the user input is valid
  while user_input not in protected_characteristics:
    print("Invalid characteristic. Please enter one of the following: ")
    print(protected_characteristics)
    user_input = input("Enter a characteristic: ")   # ask again inside the loop, otherwise it never ends

  # print the characteristic, its frequency and the percentage that this frequency is of the whole population
  frequency = characteristics.count(user_input)
  percentage = (frequency / len(characteristics)) * 100

  print(f"The characteristic {user_input} has {frequency} court cases.")
  print(f"This is {percentage}% of the whole population.")

most_common_characteristics(characteristics)

The characteristic with the highest number of court cases is: sex
('age', 'disability', 'gender reassignment', 'marriage and civil partnership', 'pregnancy and maternity', 'race', 'religion or belief', 'sex', 'sexual orientation')
Enter a characteristic: age
The characteristic age has 571 court cases.
This is 11.526039563988697% of the whole population.
