## Data structures    

Data structures are fundamental components in programming that allow you to **store, organize, and manipulate** data efficiently. They provide a way to manage and **work with collections of values** or entities. Python offers several commonly used built-in data structures.

Data structures can be broadly classified into two categories: mutable and immutable.

- **Mutable Data Structures**: Mutable data structures, as the name implies, can be modified or changed after they have been defined. This means you can add, remove, or modify elements within these structures. Examples of mutable data structures in Python include lists and dictionaries. For example, you can add or remove items from a list, or update the values associated with keys in a dictionary.

- **Immutable Data Structures**: In contrast, immutable data structures cannot be changed once they have been defined. This means that once you create an immutable data structure, its elements remain fixed and cannot be modified. Examples of immutable data structures in Python include strings and tuples. For instance, you cannot change a character in a string or modify an element in a tuple once they are created.
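
The difference between the two categories can be seen in a minimal sketch like the one below. The variable names (`colors`, `point`, `word`) are made up for illustration and do not appear elsewhere in this lesson.

```python
# A list is mutable: assigning to an index changes it in place.
colors = ["red", "green", "blue"]
colors[0] = "yellow"
print(colors)  # ['yellow', 'green', 'blue']

# A tuple is immutable: assigning to an index raises a TypeError.
point = (3, 4)
try:
    point[0] = 10
except TypeError as error:
    print("Tuple error:", error)

# A string is immutable too.
word = "cat"
try:
    word[0] = "b"
except TypeError as error:
    print("String error:", error)

# To "change" a string, we build a new one; the original stays untouched.
new_word = word.replace("c", "b")
print(word)      # cat
print(new_word)  # bat
```

Note that the failed assignments leave `point` and `word` exactly as they were.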

In this and the following lesson, we will talk about the following data structures:  
- **Lists**: Lists are ordered collections of items enclosed in square brackets [ ]. They can store elements of different data types and allow for indexing, appending, removing, and modifying elements. (MUTABLE)     
- **Dictionaries**: Dictionaries are collections of key-value pairs enclosed in curly braces { }. Each element in a dictionary consists of a key and its associated value. Dictionaries provide fast lookup and retrieval of values based on their keys and are useful for organizing and accessing data using meaningful labels. (MUTABLE)
       
There are a couple of data structures that we will cover later in the bootcamp. Although we won't dive into them in detail just yet, here's a brief definition of these data structures to give you an overview:
   
- **Sets**: Sets are unordered collections of unique elements enclosed in curly braces { }. They do not allow duplicate values and provide operations like union, intersection, and difference. Sets are useful for membership testing and eliminating duplicates from a sequence. (MUTABLE)
- **Tuples**: Tuples are similar to lists but are immutable, meaning their elements cannot be modified once created. They are enclosed in parentheses ( ) and commonly used to represent fixed collections of related values. (IMMUTABLE)
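
As a quick preview, here is roughly what each of the four structures looks like when written out. This is only a sketch with illustrative names and values; the details of dictionaries, sets, and tuples come later.

```python
shopping_list = ["milk", "eggs", "bread"]  # list: ordered, mutable
prices = {"milk": 1.20, "eggs": 2.50}      # dictionary: key-value pairs, mutable
unique_ids = {3, 1, 3, 2}                  # set: unique elements, mutable
coordinates = (40.4, -3.7)                 # tuple: ordered, immutable

print(type(shopping_list))  # <class 'list'>
print(type(prices))         # <class 'dict'>
print(type(unique_ids))     # <class 'set'>
print(type(coordinates))    # <class 'tuple'>

print(len(unique_ids))  # 3, because the duplicate 3 is stored only once
print(prices["milk"])   # 1.2, the value is looked up by its key
```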

## Data Types vs Data Structures

Data types define the nature of individual values and determine the operations you can perform on them, while data structures provide containers or organizational mechanisms to store and work with collections of data. Data structures utilize different data types to hold and manage data efficiently.

An analogy to understand the difference between data types and data structures is to compare them to atoms and molecules. In this analogy, a data type can be seen as an atom, which is the basic building block of matter. It represents a single unit of data, such as an integer or a string.

On the other hand, data structures can be likened to molecules. They are formed by combining multiple atoms (data types) together in a specific arrangement to create a more complex and meaningful structure. 

## Lists

A list is a versatile and commonly used data structure that allows you to store and manipulate collections of elements. It is a **mutable**, **ordered** sequence that can hold items of different data types, such as integers, strings, or even other lists. Lists are enclosed in square brackets [] and elements within the list are separated by commas.

### Creating a list

To create a list, you can simply assign a sequence of elements to a variable using the square brackets notation.

In [None]:
fruits = ["apple", "banana", "orange"]

In [None]:
fruits

To create an empty list, you can simply assign square brackets.

In [None]:
vegetables = []

In [None]:
vegetables

In [None]:
# To check the type of this variable we can use the type() function
type(fruits)

In [None]:
# Lists can hold elements of different types
example1 = [1, 2, 3, 4] # Just integers
example2 = ['a', 'b', 'c'] # Just characters (strings)
example3 = [1 , 'a', True] # Integer, string and boolean

In [None]:
print(type(example1)) #It has integers, but type is list
print(type(example2)) #It has strings, but type is list
print(type(example3)) #It has many types, but type is list

### Indexing and accessing elements

In Python, indexing allows you to access individual elements within a list (and any other ordered structure, such as strings and tuples). 

Each element in a list has a unique index value that represents its position within the list. The index starts from 0 for the first element and increments by 1 for each subsequent element.

Syntax: 
```python
list_name[index]
```


<img src="https://education-team-2020.s3-eu-west-1.amazonaws.com/data-analytics/prework/unit1/zero_based_indexing_python.png" width=400 height=400>

Index values can also be negative, indicating positions from the end of the list. In this case, -1 refers to the last element, -2 refers to the second last element, and so on.
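
A small sketch of negative indexing, using a hypothetical list called `letters`:

```python
letters = ["a", "b", "c", "d"]

print(letters[-1])  # d (last element)
print(letters[-2])  # c (second to last element)
print(letters[-4])  # a (same element as letters[0])
```

For a list with 4 elements, the valid negative indices run from -1 to -4.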

You can access individual elements of a list using their index values. 

In [None]:
x = ["M","o","n","t","y"," ","P","y","t","h","o","n"] # We define a list of strings

In [None]:
x[0] # Note that the index starts at 0

In [None]:
x[1]

In [None]:
print(x[0], x[1], x[2], x[3], x[4], x[5], x[6], x[7], x[8], x[9], x[10], x[11])

While the overall data structure is a list, the type of each individual element corresponds to the data type assigned to that element.

In [None]:
print(type(x))
print(type(x[0]))

### Modifying List Elements

Lists are mutable, meaning you can modify their elements after creation. You can assign new values to specific elements using their index.

In [None]:
fruits[1] = "grape"
print(fruits)  # Output: ['apple', 'grape', 'orange']

### List Operators

Lists support various operations, such as concatenation (+) to combine two lists or repetition (*) to repeat a list.

In [None]:
numbers = [1, 2, 3]
combined = fruits + numbers
print(combined)  # Output: ['apple', 'grape', 'orange', 1, 2, 3]

repeated = fruits * 3
print(repeated)  # Output: ['apple', 'grape', 'orange', 'apple', 'grape', 'orange', 'apple', 'grape', 'orange']

### List Methods and Functions

A built-in Python function that can be used on lists is `len()`, which returns the number of elements in the list.

In [None]:
# Let's add one more fruit to the list fruits
fruits = fruits + ["ananas"]  # Since fruits is a list, the + operator needs another list on its right-hand side
fruits

Note: with the `+` operator we need to concatenate another list, as in `fruits + ["ananas"]`; we cannot add just the string `"ananas"`.
Try it yourself, read the error, and try to understand why.

In [None]:
# len() gives the number of elements in the list, so we should get 4, since we have 4 elements
len(fruits) 

**Note:**
Since Python uses zero-based indexing, the last element in the list has an index of `length - 1`.

In [None]:
fruits[3] # The last element of the list is at index 3 (we have 4 elements, but we start counting at 0)

In [None]:
len(fruits)- 1 # So we can also get the index of the last element by doing length -1 (4-1=3)

In [None]:
# A more general way to get the last element is to use length - 1 as the index, which works for a list of any length
fruits[len(fruits)- 1]
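
The negative indices introduced earlier offer a shorter route to the same element. The sketch below uses a separate, hypothetical list (`fruits_example`) so that it does not interfere with the `fruits` list used in this lesson.

```python
fruits_example = ["apple", "grape", "orange", "ananas"]

last_by_length = fruits_example[len(fruits_example) - 1]  # index 3
last_by_negative = fruits_example[-1]                     # counts from the end

print(last_by_length)                      # ananas
print(last_by_negative)                    # ananas
print(last_by_length == last_by_negative)  # True
```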

Python provides built-in methods to perform common operations on lists. 

Some commonly used methods include **append(), insert(), remove(), and sort().**

In [None]:
fruits.append("kiwi")  # Adds "kiwi" to the end of the list
fruits.insert(1, "mango")  # Inserts "mango" at index 1
fruits.remove("apple")  # Removes "apple" from the list
fruits.sort()  # Sorts the list in ascending order

print(fruits)  # Output: ['ananas', 'grape', 'kiwi', 'mango', 'orange']

In [None]:
fruits.index("mango") # Returns the index of the first occurrence of the value passed in the parentheses

💡 Check for understanding

Look at the error we get if we execute the following line again. Why do you think it happens?

In [None]:
fruits.remove("apple")  # Removes "apple" from the list

Note: You may have noticed that we talk about *functions* and *methods*. 

During the bootcamp, we will delve deeper into the distinction between functions and methods. 

However, for now, you can observe a simple way to differentiate them. Functions, like `len()`, are invoked independently and do not require anything before them. On the other hand, methods, such as `append()`, are associated with a specific variable or object, denoted by the object preceding the method. 

For instance, we use `len(x)` to call the function, but we employ `x.append("hi")` to invoke the method, where x represents the variable or object.
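
A short sketch of the two call styles side by side, using a hypothetical list named `greetings`:

```python
greetings = ["hello", "hola"]

# Function: called on its own, with the list passed in as an argument.
print(len(greetings))  # 2

# Method: called on the list itself, using the dot notation.
greetings.append("hi")

print(len(greetings))  # 3
print(greetings)       # ['hello', 'hola', 'hi']
```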

### Slicing

In *Indexing and accessing elements* we saw that we can access elements from a list, individually, by doing:
```python
my_list[index]
```

When we want to access more than one element in the list, we use what is called *slicing*.

Slicing is a powerful technique in Python that allows you to extract a portion of a sequence, such as a string, list, or tuple, by specifying a range of indices.

Note: A negative index can be used to count from the end of the sequence. -1 refers to the last element, -2 refers to the second last, and so on.

When using slicing in Python, the simplified syntax is `sequence[start:end]`, where `start` represents the index of the first element to include in the slice, and `end` represents the index of the first element to exclude from the slice.

There are also special cases to consider when using slicing:

- `x[:]`: This returns everything in the list, as it includes all elements from the start to the end.

- `x[:end]`: If you omit `start`, the slice will start from the beginning of the sequence (index 0). This returns items from the beginning of the list up to, but not including, the element at the index specified by `end`.

- `x[start:]`: If you omit end, the slice will include elements until the end of the sequence. This returns items from the index specified by start until the end of the list, including the element at the start index.



In [None]:
x = ['a', 'b', 'c', 'd', 'e', 'f', 'g']  

In [None]:
x[:] # Returns the whole list

In [None]:
x[:3] # Returns a new list which includes elements at indices 0,1,2

In [None]:
x[3:] # Returns a new list which includes elements at indices 3,4,5,6

In [None]:
x[3:5] # Returns a new list which includes elements at indices 3,4
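
Negative indices, mentioned in the note above, also work inside slices. The following sketch uses its own hypothetical list, `letters`, and also checks that slicing leaves the original list unchanged.

```python
letters = ['a', 'b', 'c', 'd', 'e', 'f', 'g']

last_three = letters[-3:]      # from the third to last element to the end
without_last_two = letters[:-2]  # everything except the last two elements
middle = letters[1:-1]         # everything except the first and last elements

print(last_three)        # ['e', 'f', 'g']
print(without_last_two)  # ['a', 'b', 'c', 'd', 'e']
print(middle)            # ['b', 'c', 'd', 'e', 'f']

# Each slice is a new list; the original still has all 7 elements.
print(len(letters))  # 7
```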

Slicing accepts one more optional parameter, `step`:
`sequence[start:end:step]`

- step (optional) represents the increment between elements in the slice. By default, the step value is 1, meaning that consecutive elements are included in the slice. However, you can customize this behavior by providing a different value for the step.

Positive step value: When the step value is a positive number greater than 1, the slice takes every `step`-th element, starting at `start`, and skips the elements in between.



In [None]:
x[::2] # Returns a new list which includes every second element starting from index 0.

In [None]:
x[1::3] # Returns a new list, which starts at index 1, and includes elements with a step of 3, i.e. at indices 1, 4 

In [None]:
x[2:5:2] # Returns a new list, which includes elements at indices 2, 4 with a step of 2.

### Exercises 

1. Given the list:
    lst = [1, 2, 34, 5, 3, 12, 9, 8, 67, 89, 98, 90, 39, 21, 45, 46, 23, 13]
    
Answer the following questions:
    
- How many elements are in the list?
  `print(len(lst))` prints 18.
- Using indexes, find the first element in the list.
  `print(lst[0])` prints 1.
- Using indexes, find the last element in the list.
  `print(lst[-1])` prints 13.
- What is the index of element `90` in the list?
  `print(lst.index(90))` prints 11.
- Which are the first 8 elements in the list?
  `print(lst[:8])` prints [1, 2, 34, 5, 3, 12, 9, 8].
- Append elements 100 and 110 to the list.
  `lst.append(100)` and `lst.append(110)` add those elements to the list.
- Sort the elements in the list.
  After appending 100 and 110, `lst.sort()` followed by `print(lst)` prints [1, 2, 3, 5, 8, 9, 12, 13, 21, 23, 34, 39, 45, 46, 67, 89, 90, 98, 100, 110].

2. Write a program that searches for a specific element in a given list and returns its index. If the element is not found, print a message indicating that it is not present in the list.

In [6]:
names = ["Alice", "Bob", "Charlie", "Dave"] 
search_name = "Bob" # Element to search in "names" list

try:
    index = names.index(search_name)
    print(f"The index of the element {search_name} is {index}.")
except ValueError:
    print(f"The element {search_name} is not present in the list.")

The index of the element Bob is 1.
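
An alternative sketch, assuming we prefer to check membership first with the `in` operator instead of catching the error. The names `guest_names` and `missing_name` are illustrative; here we search for a name that is not in the list to show the other branch.

```python
guest_names = ["Alice", "Bob", "Charlie", "Dave"]
missing_name = "Eve"  # Element to search for; not present in the list

if missing_name in guest_names:
    # Only call index() once we know the element exists
    message = f"The index of the element {missing_name} is {guest_names.index(missing_name)}."
else:
    message = f"The element {missing_name} is not present in the list."

print(message)  # The element Eve is not present in the list.
```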


3. Write a program that calculates the average of the elements in a given list. The list can contain both positive and negative numbers.

*Hint: you can either look for a Python function that calculates the average or look for a Python function that sums all elements in a list and divide by the length of the list.*

In [10]:
numbers = [2, 4, 6, 8, 10]

total = sum(numbers)

count = len(numbers)

average = total / count

print(f"the average of the elements in the list is {average}.")

the average of the elements in the list is 6.0.
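
Following the hint, one function that calculates the average directly is `mean()` from Python's standard `statistics` module. The sketch below compares it with the manual calculation on an illustrative list (`values`) that mixes positive and negative numbers.

```python
from statistics import mean

values = [-4, 2, 6, -8, 10]

average_manual = sum(values) / len(values)  # sum is 6, length is 5
average_builtin = mean(values)

print(average_manual)   # 1.2
print(average_builtin)  # 1.2
```

Both approaches fail on an empty list, so it is worth checking that the list has at least one element before computing the average.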
