
# Lists



Here's a quick comparison of Python's four built-in container data types:

| Feature          | List                                  | Dictionary                           | Set                                | Tuple                             |
|------------------|---------------------------------------|--------------------------------------|------------------------------------|-----------------------------------|
| Syntax           | `[item1, item2, ...]`                 | `{'key1': value1, 'key2': value2}`   | `{item1, item2, ...}`              | `(item1, item2, ...)` or `item,`  |
| Type of Data     | Sequence                              | Mapping                              | Set                                | Sequence                          |
| Order            | Ordered                               | Insertion order kept (Python 3.7+)   | Unordered                          | Ordered                           |
| Indexing         | Yes (by index)                        | Yes (by key)                         | No                                 | Yes (by index)                    |
| Duplicate Values | Allowed                               | Values can be duplicated, keys cannot| Not allowed                        | Allowed                           |
| Mutability       | Mutable                               | Mutable                              | Mutable                            | Immutable                         |
| Usage            | For a collection of ordered items     | For key-value pairs                  | For unique items                   | For fixed data                    |
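
As a rough illustration of the table, the sketch below builds one of each container. The variable names and values are made up for this example.

```python
# One example of each container type (names and values are illustrative)
skills_list = ['sql', 'excel', 'sql']      # list: ordered, duplicates allowed
skill_levels = {'sql': 3, 'excel': 2}      # dictionary: key-value pairs
skills_set = {'sql', 'excel', 'sql'}       # set: the duplicate is dropped
skills_tuple = ('sql', 'excel')            # tuple: ordered, cannot be changed

print(skills_list[0])        # index into a list
print(skill_levels['sql'])   # look up a dictionary value by its key
print(len(skills_set))       # number of unique items in the set
print(skills_tuple[1])       # index into a tuple
```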

## Notes

- Used to store multiple ordered items in a single variable.
- Created using `[` and `]`.
- We won't cover everything you can do with a list.
- Common data types in lists: Integer, Float, String, Boolean, List, Dictionary, Tuple, Set, Object.
- You can include lists within lists.
- An easy way to store related pieces of information together.
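
To make the notes on mixed data types and nested lists concrete, here is a small sketch. The list contents are invented for illustration.

```python
# A list can mix data types and can even contain other lists
mixed = ['sql', 3, 4.5, True, ['python', 'r']]

inner = mixed[4]       # the nested list stored at index 4
print(inner)
print(mixed[4][0])     # first item of the nested list
```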

## Importance

Lists are versatile for storing sequences of data. Pandas can convert lists into Series or DataFrame objects for analysis.

## Examples

Create a list of job skills that are common to data science roles.

In [1]:
# Define a list of data science job skills
job_skills = ['sql','tableau','excel']
job_skills

['sql', 'tableau', 'excel']

### Indexing

What if we want to get a specific item in a list? We'd use list indexing.

Lists are indexed, which means each item has a numerical position, so you can access it by referring to the index number.

**Note: The first item has index 0.**

[Visual Example](https://drive.google.com/file/d/1VGn2YJVhhcnJFPz98QUcFgU8OeZA1zKK/view?usp=drive_link)

In this example, if we want to get 'tableau' from the list, we use the index `1`.

In [2]:
# Get a specific item in the list
job_skills[1]

'tableau'
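
Indexes can also count from the end of the list using negative numbers. The sketch below uses a separate example list (`example_skills`, a name chosen only for this illustration) and also shows what happens when an index does not exist.

```python
example_skills = ['sql', 'tableau', 'excel']   # example list for this sketch

print(example_skills[0])     # first item
print(example_skills[-1])    # last item
print(example_skills[-2])    # second-to-last item

# Asking for an index that does not exist raises an IndexError
try:
    example_skills[3]
except IndexError as error:
    print('IndexError:', error)
```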

### Change Value

To change the value of a specific item, refer to its index number. Below we will change the 'tableau' skill (at index `1`) to 'bigquery'.

In [3]:
# Change the value of an item
job_skills[1] = 'bigquery'

job_skills

['sql', 'bigquery', 'excel']

In [7]:
job_skills.insert(1,'Python')

In [8]:
job_skills

['sql', 'Python', 'bigquery', 'excel']

In [12]:
job_skills.pop(1)

'Python'
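
The cells above use `insert()` and `pop()` without much explanation, so here is a minimal, self-contained sketch of how the two behave. The `tools` list is a made-up example, separate from `job_skills`.

```python
tools = ['sql', 'excel']       # example list

tools.insert(1, 'python')      # put 'python' at index 1; later items shift right
print(tools)

removed = tools.pop(1)         # remove the item at index 1 and return it
print(removed)
print(tools)

last = tools.pop()             # with no index, pop() removes the last item
print(last)
```

Because `pop()` returns the removed item, it is handy when you still need the value after taking it out of the list.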

### Methods

* **Methods** are functions that belong to an object.
    * We learned a little about functions before, but as a reminder: a function is a block of code designed to do a specific task.
    * In a bit we'll be creating our own functions. For now we'll just use the functions given to us.
* All you need to know right now is that **methods** have this notation: `object.method()`
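
As a quick illustration of the `object.method()` notation compared with a plain function call, consider the sketch below. The variable names are chosen only for this example.

```python
skill = 'python'                  # a string object
skill_list = ['sql', 'excel']     # a list object

print(skill.upper())              # method: called on the object with a dot
print(skill_list.count('sql'))    # method: lists have their own set of methods
print(len(skill_list))            # function: the object is passed in as an argument
```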


### Append()

If you need to add an item at the end of a list, you can use the `append()` method. Below we'll add 'looker' as a skill at the end of our list.

We'll get into methods more later.

In [38]:
# Add a skill to the end of the list
job_skills.append('looker')

job_skills

['sql', 'bigquery', 'excel', 'looker']

### Length()

If you want to see how many items are in a list, use the `len()` function.

In [39]:
len(job_skills)

4

### Insert

To insert a list item at a specified place (index), use the `insert()` method. Here we are inserting the 'python' skill after 'bigquery', so we insert it at index `2`.

🪲 **Debugging**

**This is an intentional mistake**

This is used to demonstrate debugging.

Error: Incorrect syntax for `insert()`. The comma between the two arguments is missing.

```python
job_skills.insert(2'python')
```

Steps to Debug:

1. Look at the actual error. Can you tell what the problem is?
2. If not, then look it up:
   1. Use a chatbot like ChatGPT or Claude
   2. Look it up using Google

In [36]:
# Insert an item into the list
job_skills.insert(2'python')

job_skills

SyntaxError: invalid syntax. Perhaps you forgot a comma? (<ipython-input-36-c7d8fbf43ed4>, line 2)

In [35]:
# This is the correct code
# Insert an item into the list at index 2
job_skills.insert(2, 'python')

job_skills

['sql', 'bigquery', 'python', 'excel', 'looker']

### Remove()

To remove a specific item use the `remove()` method. Let's remove the 'looker' skill.

In [40]:
# Remove an item from the list
job_skills.remove('looker')

job_skills

['sql', 'bigquery', 'python', 'excel']
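
Two details of `remove()` are worth knowing: it deletes only the first matching item, and it raises a `ValueError` if the value is not in the list. The sketch below shows both, using a made-up `tools` list.

```python
tools = ['looker', 'sql', 'looker']   # example list with a duplicate

tools.remove('looker')                # removes only the first 'looker'
print(tools)

# Removing a value that is not in the list raises a ValueError,
# so check with `in` first when you are not sure it is there
if 'tableau' in tools:
    tools.remove('tableau')
else:
    print("'tableau' is not in the list")
```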

### Join Lists

There are a few ways to join (or concatenate) two or more lists.

1. Concatenate using the `+` operator.


In [41]:
# Concatenate
skills1 = ['SQL', 'Tableau']
skills2 = ['Excel']

skills3 = skills1 + skills2
skills3

['SQL', 'Tableau', 'Excel']

2. Append all items from one list to another using `.append()` in a loop. Here we're appending every item from `skills3` to the `skills4` list. *Don't worry, we'll be going into `for` loops later, but here's a preview.*


In [42]:
# Append
skills4 = ['Python', 'Power BI']

# Append all items from skills3 to skills4 list
for x in skills3:
    skills4.append(x)

skills4

['Python', 'Power BI', 'SQL', 'Tableau', 'Excel']

3. Use the `extend()` method to add the elements of one list to another. Here we add the elements from the `skills4` list to the `skills5` list.



In [43]:
# Extend
skills5 = ['Statistics', 'Machine Learning']

skills5.extend(skills4)

skills5

['Statistics',
 'Machine Learning',
 'Python',
 'Power BI',
 'SQL',
 'Tableau',
 'Excel']
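
A common point of confusion is what happens if you pass a whole list to `append()` instead of `extend()`. The sketch below compares the two; the list names are invented for this example.

```python
base_append = ['SQL', 'Tableau']
base_extend = ['SQL', 'Tableau']
extra = ['Excel', 'Python']

base_append.append(extra)   # adds the whole list as ONE nested item
base_extend.extend(extra)   # adds each item of the list separately

print(base_append)
print(base_extend)
```

This is why the loop in option 2 appends items one at a time: appending the list itself would nest it rather than join it.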

### Join()

The `.join()` method is a string method that concatenates a sequence of strings, placing a separator (e.g. `', '`) between each element. You call it on the separator: `separator.join(items)`. It is often used to format output for readability.

It works with lists, tuples, and other iterables of strings. In our case we'll mostly use it to combine the strings in a list.

In [20]:
skills = ['Python', 'SQL', 'Excel']

In [21]:
# Join the skills with ', ' and concatenate the result onto the sentence
print('I have these skills: ' + ', '.join(skills))

I have these skills: Python, SQL, Excel
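
One thing to watch for: `.join()` only accepts strings, so a list of numbers needs to be converted first. Here is a minimal sketch, assuming a hypothetical list of years.

```python
years = [2022, 2023, 2024]    # example list of integers

# ', '.join(years) would raise a TypeError because the items are not strings,
# so convert each item with str() first
joined = ', '.join(str(year) for year in years)
print('Years covered: ' + joined)
```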


### Slicing Lists

Slicing Syntax:
* Syntax: `list[start:end:step]`  
    * `start`: The starting index (inclusive)
    * `end`: The ending index (exclusive)
    * `step`: Steps to take between items

In [22]:
skills = ['Python', 'SQL', 'Excel']

# Extract the first two items
first_two = skills[0:2]
first_two

['Python', 'SQL']

In [23]:
job_skills[0:3]

['sql', 'bigquery', 'python']

In [24]:
job_skills[::3]

['sql', 'excel']

In [25]:
job_skills[::1]

['sql', 'bigquery', 'python', 'excel']

In [28]:
lukes_skill = ['python', 'java', 'r']
kellys_skill = ['sql', 'tableau', 'excel']

all_skill = lukes_skill + kellys_skill
all_skill

['python', 'java', 'r', 'sql', 'tableau', 'excel']

In [29]:
all_skill[::2]

['python', 'r', 'tableau']

In [35]:
all_skill[::3]

['python', 'sql']

In [36]:
all_skill[5:]

['excel']

`start` has a default value of `0` and `end` defaults to the length of the list, so the slice runs through the last item.

Therefore, you can omit either one when you want its default value.

In [47]:
full_list = skills[:]
full_list

['Python', 'SQL', 'Excel']

In [48]:
also_first_two = skills[:2]
also_first_two

['Python', 'SQL']

In [49]:
last_two = skills[1:]
last_two

['SQL', 'Excel']

In [50]:
last_one = skills[-1:]
last_one

['Excel']



For `step` the default value is `1`, but we can change it:

In [51]:
skills = ['Python', 'SQL', 'Excel', 'R', 'Java']

# Extract every second item starting from the first
every_second = skills[0::2]
every_second

['Python', 'Excel', 'Java']
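
Negative values also work in slices. The sketch below, using an example list named `languages` (a name picked for illustration), shows a negative `step`, a negative `start`, and that slicing leaves the original list untouched.

```python
languages = ['Python', 'SQL', 'Excel', 'R', 'Java']   # example list

reversed_copy = languages[::-1]   # a step of -1 walks the list backwards
last_two = languages[-2:]         # a negative start counts from the end
middle = languages[1:4]           # indexes 1, 2 and 3 (index 4 is excluded)

print(reversed_copy)
print(last_two)
print(middle)
print(languages)                  # slicing returns a new list; the original is unchanged
```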

### Unpack List

Unpacking assigns each value in a list to its own variable in a single statement. The number of variables must match the number of items in the list.

In [38]:
job_skill=['python','sql','excel']

In [43]:
# Unpacking the list
skill1, skill2, skill3 = job_skill

# Printing the unpacked variables
print(skill1)
print(skill2)
print(skill3)

python
sql
excel


### Extended Unpacking

Extended unpacking uses a starred variable (e.g. `*rest`) to collect a subset of the elements into a list.

In [45]:
skill_concerned, *skill_dont_care = job_skill

In [46]:
print(skill_concerned)
print(skill_dont_care)

python
['sql', 'excel']


In [53]:
# Unpacking the sql skills together and then unpacking the rest of the skills
*sql_skills, skill3, skill4 = job_skills

print(sql_skills)  # List of the SQL skills
print(skill3)
print(skill4)

['sql', 'bigquery']
python
excel


In [54]:
job_skills = ['python', 'excel', 'sql', 'looker']

In [55]:
skill1, skill2, skill3=job_skills

ValueError: too many values to unpack (expected 3)

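This statement means: "Take exactly 3 things from `job_skills` and put them into 3 variables."

But `job_skills` holds 4 items, so Python raises a `ValueError`. The same kind of error ("not enough values to unpack") appears when the list has fewer items than there are variables.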

In [56]:
skill1, skill2, *others = job_skills


In [65]:
job_skills = ["Python", "SQL", "JavaScript"]


In [66]:
skill1, skill2, skill3 = job_skills

In [61]:
print(skill1)
print(skill2)
print(skill3)

Python
SQL
JavaScript


- `a, b = my_list` works only if the list has exactly 2 items.
- `a, b, *c = my_list` assigns the first two items to `a` and `b`, and `c` gets the rest as a list.

In [62]:
skill_concerned,skill_dont_care=job_skills

ValueError: too many values to unpack (expected 2)

In [67]:
skill_concerned,*skill_dont_care=job_skills

In [68]:
print(skill_concerned)
print(skill_dont_care)

Python
['SQL', 'JavaScript']
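
The starred variable does not have to come last. As a small sketch (with a made-up `tools` list), it can sit in the middle to grab everything between the first and last items.

```python
tools = ['python', 'sql', 'excel', 'tableau', 'looker']   # example list

first, *middle, last = tools     # the starred name collects whatever is left over
print(first)
print(middle)
print(last)

# With only two items, the starred variable becomes an empty list
only_first, *rest, only_last = ['python', 'sql']
print(rest)
```

Only one starred variable is allowed per unpacking statement, since Python would otherwise not know how to split the leftover items.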
