<a target="_blank" href="https://colab.research.google.com/github/lukebarousse/Python_Data_Analytics_Course/blob/main/1_Basics/14_List_Comprehensions.ipynb">
  <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
</a>

# List Comprehensions

## Notes

* A way to create a new list (with shorter syntax) based on the values of an existing list or other iterable.

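The same syntax is not limited to lists:
- `set` comprehension
- generator expression (written with parentheses; Python has no `tuple` comprehension, so pass the generator to `tuple()` to build a tuple)
- `dict` comprehension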
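
As a minimal sketch of these other forms (the `nums` list and the `squares_*` names are made up for illustration): curly braces build a set or a dictionary, while parentheses build a lazy generator expression rather than a tuple.

```python
# Illustrative data, not taken from the course dataset
nums = [1, 2, 2, 3]

squares_set = {n ** 2 for n in nums}          # set comprehension: duplicates collapse
squares_dict = {n: n ** 2 for n in nums}      # dict comprehension: key -> value
squares_gen = (n ** 2 for n in nums)          # generator expression, evaluated lazily
squares_tuple = tuple(n ** 2 for n in nums)   # wrap a generator in tuple() to get a tuple

print(sorted(squares_set))  # [1, 4, 9]
print(squares_dict)         # {1: 1, 2: 4, 3: 9}
print(squares_tuple)        # (1, 4, 4, 9)
```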

## Importance

List comprehensions provide a concise way to create lists and are useful for data manipulation and filtering in pandas.

In [None]:
# Creating a list of numbers from 0 to 9
numbers = [x for x in range(10)]
numbers

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

## Example # 1

We're going to modify the example we used with our `for` loop. Instead of printing the whole statement "Position requires X years of experience", we are just going to print the experience required. This is a simplified version of our earlier code.

In [None]:
# Minimum experience required for job positions
position_experience_requirements = [1, 2, 3]

# Iterate over each experience requirement in the list of job positions
for x in position_experience_requirements:
    print(x)

1
2
3


Now let's use a list comprehension to shorten this. To recap the code above:

- It defines `position_experience_requirements` as a list of integers representing the minimum years of experience required for various job positions.
- The `for` loop goes through each item `x` in `position_experience_requirements` and prints it.

In [None]:
# Create a new list containing each experience requirement
experience = [x for x in position_experience_requirements]

# The result is a list of experience requirements
experience

[1, 2, 3]

This is pretty basic. So let's make it a bit more useful. I'm going to add in a variable that stores the user's years of experience.

In [None]:
user_experience = 2
user_experience

2

Now, we are adding an if condition to our list comprehension. This condition checks if the user's experience (`user_experience`) is greater than or equal to each item (`x`) in the `position_experience_requirements` list.

```python
if user_experience >= x
```

It keeps only the positions whose requirement is less than or equal to the user's experience.

In [None]:
# Create a list of job positions for which the user is qualified

qualified_positions = [x for x in position_experience_requirements if user_experience >= x]

qualified_positions

[1, 2]
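
To make the filter easier to read, here is a sketch of the same logic written as an ordinary loop next to the comprehension. The name `qualified_loop` is introduced only for this comparison; the values are the ones used above.

```python
position_experience_requirements = [1, 2, 3]
user_experience = 2

# Loop form: start empty, test each item, append the ones that pass
qualified_loop = []
for x in position_experience_requirements:
    if user_experience >= x:
        qualified_loop.append(x)

# Comprehension form: the expression, then the for clause, then the filter
qualified_positions = [x for x in position_experience_requirements if user_experience >= x]

print(qualified_loop == qualified_positions)  # True
```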

## Example # 2

This first code block extracts the data we need for this exercise; we'll dive into this later in the course.

For now, just understand that I'm extracting a list of job titles (`job_list`) from our dataset.

In [None]:
from datasets import load_dataset

# Load the dataset
dataset = load_dataset('lukebarousse/data_jobs')
df = dataset['train'].to_pandas()

# Create a list of job titles from the dataset
job_list = df['job_title'].tolist()

# Remove any non-string values from the list
job_list = [job for job in job_list if isinstance(job, str)]

Let's convert our previous `for` loop into a list comprehension!

In [None]:
# previous for loop
analyst_list = []

for job in job_list:
    if "Data Analyst" in job:
        analyst_list.append(job)

# show first 10 values
analyst_list[:10]

['Technical Data Analyst',
 'Sr. Data Analyst - Full-time / Part-time',
 'Data Analyst',
 'Data Analyst',
 'Data Analyst Junior settore Logistica',
 'Senior Data Analyst - Now Hiring',
 'Health Technology Data Analyst',
 'Data Analyst',
 'Love Excel? Junior Data Analyst for Real Estate',
 'Data Analyst']

However, that took 4 lines of code!

With a list comprehension we can do it in only 1.

In [None]:
analyst_list = [job for job in job_list if "Data Analyst" in job]

# show first 10 values
analyst_list[:10]

['Technical Data Analyst',
 'Sr. Data Analyst - Full-time / Part-time',
 'Data Analyst',
 'Data Analyst',
 'Data Analyst Junior settore Logistica',
 'Senior Data Analyst - Now Hiring',
 'Health Technology Data Analyst',
 'Data Analyst',
 'Love Excel? Junior Data Analyst for Real Estate',
 'Data Analyst']

In [None]:
print("Job list is:     ", len(job_list), "jobs")
print("Analyst list is: ", len(analyst_list), "jobs")

Job list is:      787685 jobs
Analyst list is:  163124 jobs


In [4]:
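# Nested comprehension: one row per even j in 4..8, using only the i in 1..9 with i % 4 > 1
multiplication_table = [[i * j for i in range(1, 10) if i % 4 > 1] for j in range(4, 9) if j % 2 == 0]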
print(multiplication_table)

[[8, 12, 24, 28], [12, 18, 36, 42], [16, 24, 48, 56]]


In [6]:
# Squares of the numbers in 5..18 that leave remainder 1 when divided by 3 (not only even squares)
squares = [x ** 2 for x in range(5, 19) if x % 3 == 1]
print(squares)

[49, 100, 169, 256]


In [8]:
nested_list = [[1, 2, 3], [4, 5], [9, 7, 8]]
# Flatten: the outer loop (sublist) is written first, the inner loop (item) second
flattened = [item for sublist in nested_list for item in sublist]
print(flattened)

[1, 2, 3, 4, 5, 9, 7, 8]
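
The order of the `for` clauses in a comprehension matches the order of the equivalent nested loops. A small sketch of that correspondence (`flattened_loop` is a name introduced here only for comparison):

```python
nested_list = [[1, 2, 3], [4, 5], [9, 7, 8]]

flattened_loop = []
for sublist in nested_list:        # outer loop: written first in the comprehension
    for item in sublist:           # inner loop: written second
        flattened_loop.append(item)

flattened = [item for sublist in nested_list for item in sublist]

print(flattened_loop == flattened)  # True
```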


In [7]:
# Every (x, y) pair: x varies slowest, y fastest
grid = [(x, y) for x in range(5, 9) for y in range(4, 8)]
print(grid)

[(5, 4), (5, 5), (5, 6), (5, 7), (6, 4), (6, 5), (6, 6), (6, 7), (7, 4), (7, 5), (7, 6), (7, 7), (8, 4), (8, 5), (8, 6), (8, 7)]


In [10]:
words = ['apple', 'banana', 'champ', 'damn', 'cherry', 'pop']
# Dict comprehension: map each word longer than 5 characters to its length
word_lengths = {word: len(word) for word in words if len(word) > 5}
print(word_lengths)

{'banana': 6, 'cherry': 6}


In [18]:
sentence = "I have a dream that one day this nation will rise up and live out the true meaning of its creed."

# Lowercase first, then split: str has a lower() method, but list does not
word_list = sentence.lower().split()
print(word_list)

# Count each distinct word; set(word_list) ensures every word becomes a key only once
word_count = {word: word_list.count(word) for word in set(word_list)}
print(word_count)

['i', 'have', 'a', 'dream', 'that', 'one', 'day', 'this', 'nation', 'will', 'rise', 'up', 'and', 'live', 'out', 'the', 'true', 'meaning', 'of', 'its', 'creed.']
{'i': 1, 'the': 1, 'true': 1, 'live': 1, 'a': 1, 'up': 1, 'meaning': 1, 'one': 1, 'will': 1, 'this': 1, 'have': 1, 'and': 1, 'dream': 1, 'that': 1, 'rise': 1, 'of': 1, 'day': 1, 'creed.': 1, 'nation': 1, 'its': 1, 'out': 1}
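
The dict comprehension above rescans the word list once per distinct word. As an alternative sketch (a different technique, not the one used in this notebook), the standard library's `collections.Counter` tallies the words in a single pass:

```python
from collections import Counter

sentence = "I have a dream that one day this nation will rise up and live out the true meaning of its creed."

# Counter is a dict subclass that maps each word to its number of occurrences
word_count = Counter(sentence.lower().split())

print(word_count['that'])  # 1
print(len(word_count))     # 21 distinct words
```

Note that punctuation is still attached to words here (for example `'creed.'`), just as in the comprehension version.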


In [23]:
data_string = "In 2025, the population of city X is expected to reach 1 million 250k."

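# Keep every digit character and convert it to an int
digits = [int(char) for char in data_string if char.isdigit()]
print(digits)

# Separate exercise: pad a list with zeros up to length n using list repetition
n = 10
rest = [6, 9, 8, 5] + [0] * (n - 4)
print(rest)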

[2, 0, 2, 5, 1, 2, 5, 0]
[6, 9, 8, 5, 0, 0, 0, 0, 0, 0]


In [22]:
people = [
    {'name': 'Alice', 'age': 30},
    {'name': 'Bob', 'age': 25},
    {'name': 'Charlie', 'age': 35},
    {'name': 'David', 'age': 20}
]
younger_than_25 = [person for person in people if person['age'] < 25]
print(younger_than_25)

[{'name': 'David', 'age': 20}]


In [26]:
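n = int(input('till what n '))
# Bug: the right-hand side is evaluated before `fibo` is rebound, so the comprehension
# reads whatever `fibo` held earlier (a leftover list from a previous cell).
# In a fresh session this line raises NameError instead.
fibo = [0, 1] + [fibo[i - 1] + fibo[i - 2] for i in range(2, n)]
print(fibo)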

till what n 10
[0, 1, 1, 2, 2, 1, 0, 0, 0, 0]
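
Why is the result wrong? Python evaluates the whole right-hand side before it binds the name `fibo`, so the comprehension cannot see the list it is building. It indexes whatever `fibo` referred to beforehand. The sketch below assumes a stale list left over from an earlier cell; its exact contents are a guess, chosen only because they reproduce the output shown above.

```python
# Assumption: a stale list like this was left in `fibo` by an earlier cell
fibo = [0, 1, 1, 1, 0, 0, 0, 0, 0, 0]
n = 10

# The comprehension indexes the stale list above, not the list being built
fibo = [0, 1] + [fibo[i - 1] + fibo[i - 2] for i in range(2, n)]
print(fibo)  # [0, 1, 1, 2, 2, 1, 0, 0, 0, 0]
```

A comprehension has no way to refer to its own partial result, which is why a sequence whose terms depend on earlier terms is usually built with a regular loop.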


In [36]:
n = int(input('till what n '))
fibo = [0, 1]
for i in range(2, n):
    fibo.append(fibo[i-1] + fibo[i-2])
print(fibo)

till what n 10
[0, 1, 1, 2, 3, 5, 8, 13, 21, 34]


In [53]:
n = int(input('till what n '))

# Seed the Fibonacci sequence with its first two values
fibo = [0, 1]
# This works, but the comprehension is used only for its side effect on fibo
[fibo.append(fibo[-1] + fibo[-2]) for _ in range(2, n)]
print(fibo)
# Copying the items with another comprehension is unnecessary; fibo already holds the result
# fibo = [fibo[i] for i in range(n)]
print(fibo)

till what n 20
[0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377, 610, 987, 1597, 2584, 4181]
[0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377, 610, 987, 1597, 2584, 4181]
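
The cell above produces the right sequence, but it uses a comprehension purely for its side effect. A short sketch of what that comprehension actually returns (`throwaway` is a name introduced here for illustration):

```python
fibo = [0, 1]

# list.append returns None, so the comprehension builds a list of None values
throwaway = [fibo.append(fibo[-1] + fibo[-2]) for _ in range(2, 6)]

print(throwaway)  # [None, None, None, None]
print(fibo)       # [0, 1, 1, 2, 3, 5]
```

Because the list of `None` values is discarded, a plain `for` loop with `append` expresses the same intent more clearly.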


In [46]:
def fibonacci(n):
    return [fib(i) for i in range(n)]

def fib(k):
    if k == 0:
        return 0
    elif k == 1:
        return 1
    else:
        a, b = 0, 1
        for _ in range(2, k + 1):
            a, b = b, a + b
        return b

n = int(input('till what n '))
print(fibonacci(n))


In [49]:
sentence = "This is an example sentence."
lowered = sentence.lower()
# Count each vowel that appears at least once in the sentence
vowel_counts = {vowel: lowered.count(vowel) for vowel in 'aeiou' if vowel in lowered}
print(vowel_counts)


{'a': 2, 'e': 5, 'i': 2}
