# Lists

Here's a quick comparison of Python's four built-in container data types:

| Feature          | List                                  | Dictionary                           | Set                                | Tuple                             |
|------------------|---------------------------------------|--------------------------------------|------------------------------------|-----------------------------------|
| Syntax           | `[item1, item2, ...]`                 | `{'key1': value1, 'key2': value2}`   | `{item1, item2, ...}`              | `(item1, item2, ...)` or `item,`  |
| Type of Data     | Sequence                              | Mapping                              | Set                                | Sequence                          |
| Order            | Ordered                               | Insertion-ordered (Python 3.7+)      | Unordered                          | Ordered                           |
| Indexing         | Yes (by index)                        | Yes (by key)                         | No                                 | Yes (by index)                    |
| Duplicate Values | Allowed                               | Values can be duplicated, keys cannot| Not allowed                        | Allowed                           |
| Mutability       | Mutable                               | Mutable                              | Mutable                            | Immutable                         |
| Usage            | For a collection of ordered items     | For key-value pairs                  | For unique items                   | For fixed data                    |
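
As a rough illustration of a few rows in the table, the sketch below builds one of each container from the same made-up skill names. The variable names are illustrative assumptions, not part of the course material.

```python
# One of each container, built from the same illustrative values
skills_list = ['sql', 'excel', 'sql']     # list: ordered, duplicates allowed
skills_tuple = ('sql', 'excel', 'sql')    # tuple: ordered, immutable
skills_set = {'sql', 'excel', 'sql'}      # set: duplicates collapse to one
skills_dict = {'sql': 3, 'excel': 3}      # dict: unique keys, values may repeat

print(len(skills_list))     # 3
print(len(skills_set))      # 2
print(skills_list[0])       # sql
print(skills_dict['sql'])   # 3

# Tuples reject item assignment because they are immutable
try:
    skills_tuple[0] = 'python'
except TypeError as error:
    tuple_error = type(error).__name__
print(tuple_error)          # TypeError
```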

## Notes

- Lists store multiple ordered items in a single variable.
- They are created with square brackets, `[` and `]`.
- This section covers the most common list operations, not everything a list can do.
- Items can be of any type: integer, float, string, Boolean, list, dictionary, tuple, set, or any other object.
- Lists can be nested inside other lists.
- Lists are mutable, so items can be added, removed, or changed after the list is created.
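
To make the notes on mixed types and nesting concrete, here is a minimal sketch. The list contents are invented for illustration.

```python
# A list can hold mixed types, including another list
mixed = [1, 2.5, 'python', True, [2, 'sql'], {'tool': 'excel'}, (3, 4)]

inner = mixed[4]       # the nested list is a single item of the outer list
print(inner)           # [2, 'sql']
print(mixed[4][1])     # sql (index the outer list, then the inner one)
print(len(mixed))      # 7
```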

## Importance

Versatile for storing sequences of data. Pandas can convert lists into Series or DataFrame objects for analysis.

## Examples

The examples below create a list and then work with a list of job skills that are common to data science roles.


In [3]:
my_list = [1, 'python', [2, 'sql']]

print(my_list)

[1, 'python', [2, 'sql']]


In [5]:
help(list)

Help on class list in module builtins:

class list(object)
 |  list(iterable=(), /)
 |
 |  Built-in mutable sequence.
 |
 |  If no argument is given, the constructor creates a new empty list.
 |  The argument must be an iterable if specified.
 |
 |  Methods defined here:
 |
 |  __add__(self, value, /)
 |      Return self+value.
 |
 |  __contains__(self, key, /)
 |      Return bool(key in self).
 |
 |  __delitem__(self, key, /)
 |      Delete self[key].
 |
 |  __eq__(self, value, /)
 |      Return self==value.
 |
 |  __ge__(self, value, /)
 |      Return self>=value.
 |
 |  __getattribute__(self, name, /)
 |      Return getattr(self, name).
 |
 |  __getitem__(self, index, /)
 |      Return self[index].
 |
 |  __gt__(self, value, /)
 |      Return self>value.
 |
 |  __iadd__(self, value, /)
 |      Implement self+=value.
 |
 |  __imul__(self, value, /)
 |      Implement self*=value.
 |
 |  __init__(self, /, *args, **kwargs)
 |      Initialize self.  See help(type(self)) for accurate sign

In [11]:
job_skills = ['sql', 'tableau', 'excel']

# add
job_skills.append('python')

print(job_skills)

['sql', 'tableau', 'excel', 'python']


In [12]:
job_skills = ['sql', 'tableau', 'excel']

# remove the first occurrence
job_skills.remove('tableau')

print(job_skills)

['sql', 'excel']
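
Two details of `remove()` are easy to miss: it deletes only the first matching item, and it raises `ValueError` when the value is not in the list. A small sketch, using a list with a deliberate duplicate:

```python
job_skills = ['sql', 'tableau', 'excel', 'tableau']

# Only the first 'tableau' is removed; the second one stays
job_skills.remove('tableau')
print(job_skills)    # ['sql', 'excel', 'tableau']

# 'r' is not in the list, so check membership first to avoid a ValueError
if 'r' in job_skills:
    job_skills.remove('r')
print(job_skills)    # ['sql', 'excel', 'tableau']
```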


In [15]:
job_skills = ['sql', 'tableau', 'excel']

# Number of items in list
num_skills = len(job_skills)

print(num_skills)

3


In [16]:
job_skills = ['sql', 'tableau', 'excel']

# Get the skill at index 2. List indexes are zero-based.
excel_skill = job_skills[2]

print(excel_skill)

excel
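
Indexing also accepts negative numbers, and an index past the end raises `IndexError`. The sketch below shows both with the same three-item list:

```python
job_skills = ['sql', 'tableau', 'excel']

last_skill = job_skills[-1]    # -1 refers to the last item
print(last_skill)              # excel

# Valid indexes are 0, 1 and 2, so index 3 is out of range
try:
    job_skills[3]
except IndexError as error:
    message = str(error)
print(message)
```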


In [18]:
job_skills = ['sql', 'excel', 'python']

# insert at index 2 (3rd item)
job_skills.insert(2,'tableau')

print(job_skills)

['sql', 'excel', 'tableau', 'python']


In [20]:
job_skills = ['sql', 'excel', 'tableau', 'python']

print("Current List: ", job_skills)

# pop() removes and returns the last item unless an index is provided
removed_item = job_skills.pop()

print("Removed item: ", removed_item)
print("Updated List: ", job_skills)

Current List:  ['sql', 'excel', 'tableau', 'python']
Removed item:  python
Updated List:  ['sql', 'excel', 'tableau']
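
For comparison, a short sketch of `pop()` with an explicit index, which removes and returns the item at that position instead of the last one:

```python
job_skills = ['sql', 'excel', 'tableau', 'python']

# Passing an index removes and returns the item at that position
first_item = job_skills.pop(0)

print(first_item)    # sql
print(job_skills)    # ['excel', 'tableau', 'python']
```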


## Slicing - Accessing multiple values

Slicing syntax: `list[start:end:step]`
* `start`: The starting index (inclusive)
* `end`: The ending index (exclusive)
* `step`: The number of positions to move between items

In [41]:
job_skills = ['sql', 'excel', 'python']

# The start index is inclusive, but the end index is exclusive.
# To include every item, the end index must be one higher than the last index.
print("start index 0 and end index 3 specified: ", job_skills[0:3])

# With only ':' and no indexes, the slice returns every item
print("default indexes: ", job_skills[:])

# Default start index with an end index of 1 returns the first item
print("default start index, end index of 1: ", job_skills[:1])

# Start index of -1 with the default end index returns the last item
print("start index -1, default end index to get last item in list: ", job_skills[-1:])

start index 0 and end index 3 specified:  ['sql', 'excel', 'python']
default indexes:  ['sql', 'excel', 'python']
default start index, end index of 1:  ['sql']
start index -1, default end index to get last item in list:  ['python']

Negative indexes count from the end of the list: -1 is the last item, -2 is the next to last, and so on.


### Slicing with Steps

In [44]:
luke_skills = ['python', 'bigquery', 'r']
kelly_skills = ['python', 'sql', 'looker']

all_skills = luke_skills + kelly_skills

print(all_skills)

print("Print every other item:")
print(all_skills[::2])

print("Print last two items:")
print(all_skills[-2:])


['python', 'bigquery', 'r', 'python', 'sql', 'looker']
Print every other item:
['python', 'r', 'sql']
Print last two items:
['sql', 'looker']
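
A slice can also combine all three parts, and a negative step walks the list backwards. The sketch below reuses the combined list from the cell above. Slicing returns a new list, so the original is unchanged.

```python
all_skills = ['python', 'bigquery', 'r', 'python', 'sql', 'looker']

# Start at index 1, stop before index 5, take every second item
middle_step = all_skills[1:5:2]
print(middle_step)        # ['bigquery', 'python']

# A step of -1 returns the items in reverse order
reversed_skills = all_skills[::-1]
print(reversed_skills)    # ['looker', 'sql', 'python', 'r', 'bigquery', 'python']

# The original list is not modified by slicing
print(all_skills[0])      # python
```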


## Unpacking

Unpacking assigns each value in an iterable to its own variable in a single statement.

In [48]:
job_skills = ['python', 'excel', 'sql']

# unpacking (assigning variables to each item in list)
skill1, skill2, skill3 = job_skills

print(skill1)
print(skill2)
print(skill3)

python
excel
sql


In [51]:
job_skills = ['python', 'excel', 'sql', 'looker']

# unpacking fails when the number of variables does not match the number of items
skill1, skill2, skill3 = job_skills

print(skill1)
print(skill2)
print(skill3)

ValueError: too many values to unpack (expected 3)

In [50]:
job_skills = ['python', 'excel', 'sql', 'looker']

# unpacking with the * operator
# assign the first item to the first variable and all other items to the second variable
skill_concerned, *skill_dont_care = job_skills

print(skill_concerned)
print(skill_dont_care)


python
['excel', 'sql', 'looker']
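
The starred variable does not have to come last. As a small sketch with illustrative variable names, it can sit in the middle and collect whatever is left between the first and last items, even if that is nothing:

```python
job_skills = ['python', 'excel', 'sql', 'looker']

# The starred name collects everything between the first and last items
first, *middle, last = job_skills
print(first)     # python
print(middle)    # ['excel', 'sql']
print(last)      # looker

# With only two items, the starred name becomes an empty list
only_first, *nothing, only_last = ['python', 'sql']
print(nothing)   # []
```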


# Problems

## Access Second Job Title (1.8.1) - Problem

In [52]:
job_titles = ['Data Scientist', 'Data Analyst', 'Machine Learning Engineer']

print(job_titles[1])

Data Analyst


## Change Third Job Title (1.8.2) - Problem

In [53]:
job_titles = ['Data Scientist', 'Data Analyst', 'Machine Learning Engineer']

job_titles[2] = 'AI Specialist'

print(job_titles)

['Data Scientist', 'Data Analyst', 'AI Specialist']


## Slice Job Titles List (1.8.3) - Problem

In [54]:
job_titles = ['Data Scientist', 'Data Analyst', 'Machine Learning Engineer', 'Data Engineer']
print(job_titles[:2])

['Data Scientist', 'Data Analyst']


## Append Job Title (1.8.4) - Problem

In [55]:
job_titles = ['Data Scientist', 'Data Analyst', 'Machine Learning Engineer']
job_titles.append('Data Engineer')

print(job_titles)

['Data Scientist', 'Data Analyst', 'Machine Learning Engineer', 'Data Engineer']


## Insert Job Title (1.8.5) - Problem

In [56]:
job_titles = ['Data Scientist', 'Data Analyst', 'Machine Learning Engineer']
job_titles.insert(1, 'Business Analyst')

print(job_titles)


['Data Scientist', 'Business Analyst', 'Data Analyst', 'Machine Learning Engineer']
