# Working with lists of data

When we code, we often need to work with a list of data. For example, let's take a look at another cool machine learning function below 👀.

In [1]:
%pip install transformers
from transformers import pipeline

def classify_text(text, candidate_labels):
    # Initialize the pipeline for zero-shot classification
    classifier = pipeline(
        "zero-shot-classification",
        model="facebook/bart-large-mnli",
    )

    # Classifying the text
    result = classifier(text, candidate_labels)

    # Returning the label with the highest score
    return result['labels'][0]

In [2]:
classify_text("Tomorrow seems will be raining", ["weather", "finance", "tech"])

'weather'

Whoa! A new data type! What is that?

It's called a list. A list is a collection of data, written inside square brackets `[ ]`. A list can hold several items of different data types, so it can be a list of numbers, a list of strings, a mix of numbers and strings, or even a list of lists!

For now, have fun with the code above. The first argument is the text that you want to classify, and the second argument is the list of possible labels.
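To see why `classify_text` returns `result['labels'][0]`, here is a minimal sketch of the shape of a zero-shot classification result. The dictionary below is written by hand with made-up scores purely for illustration (it is not real model output), and it assumes the labels come back ordered from the highest score to the lowest.

```python
# Hand-written stand-in for a classifier result; the scores are made up for illustration
example_result = {
    "sequence": "Tomorrow seems will be raining",
    "labels": ["weather", "tech", "finance"],  # assumed sorted from best to worst match
    "scores": [0.95, 0.03, 0.02],
}

# The first label is the best match, so index 0 picks it
best_label = example_result["labels"][0]
print(best_label)  # weather
```

Here `example_result["labels"]` is itself a list, and `[0]` picks its first element.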

Try:
```python
"Jakarta", ["city", "country", "building", "continent"]
"Andi", ["animal", "item", "technology", "person's name"]
"Hello everyone!", ["greeting", "farewell"]
"Algebra", ["math", "physics", "chemistry", "biology", "geography", "history", "language", "art", "music", "sport", "programming"]
```

Note that a list can have any length: it can have 1 element, 2 elements, 3 elements, or even 100 elements. It can even have 0 elements (an empty list)!
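As a small sketch of the ideas above, the lists below are made-up examples (the variable names are only for illustration). They show mixed data types, a list of lists, an empty list, and how to pick one element by its position.

```python
# A list can mix data types, and can even contain other lists
numbers = [1, 2, 3]
mixed = ["Jakarta", 2023, 3.5]
nested = [["a", "b"], [1, 2]]
empty = []

# len() counts the elements in a list
print(len(numbers))  # 3
print(len(nested))   # 2 (two inner lists)
print(len(empty))    # 0

# Elements are accessed by position, and counting starts from 0
print(mixed[0])      # Jakarta
print(nested[1][0])  # 1
```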

# I want to process these texts!

Imagine you have hundreds of texts that you want to classify, like the ones below:

- The service at the restaurant was really impressive
- What is the status of my order number #1234?
- I have a proposal for a new feature in your app
- My package arrived late and the item was damaged
- Your team is doing an excellent job
- Could you help clarify the specifications of this product?
- I'm extremely dissatisfied with the customer service
- Have you thought about offering more plant-based options on your menu?
- I really appreciate the speedy response from your customer service team
- I enjoy using your application, great work

And we want to classify the texts above into several categories:

> opinion, complaint, query, suggestion, appreciation

Processing a list of data like this is a really common task. Let's look at the basics of how to process this kind of data:

In [3]:
texts = [
    "The service at the restaurant was really impressive",
    "What is the status of my order number #1234?",
    "I have a proposal for a new feature in your app",
    "My package arrived late and the item was damaged",
    "Your team is doing an excellent job",
    "Could you help clarify the specifications of this product?",
    "I'm extremely dissatisfied with the customer service",
    "Have you thought about offering more plant-based options on your menu?",
    "I really appreciate the speedy response from your customer service team",
    "I enjoy using your application, great work"
]
candidate_labels = ["opinion", "complaint", "query", "suggestion", "appreciation"]

for text in texts:
    # Classify the text
    label = classify_text(text, candidate_labels)

    # Print the text and its corresponding label
    print("Text: " + text + ", Label: " + label)

Text: The service at the restaurant was really impressive, Label: appreciation
Text: What is the status of my order number #1234?, Label: query
Text: I have a proposal for a new feature in your app, Label: suggestion
Text: My package arrived late and the item was damaged, Label: complaint
Text: Your team is doing an excellent job, Label: appreciation
Text: Could you help clarify the specifications of this product?, Label: query
Text: I'm extremely dissatisfied with the customer service, Label: complaint
Text: Have you thought about offering more plant-based options on your menu?, Label: suggestion
Text: I really appreciate the speedy response from your customer service team, Label: appreciation
Text: I enjoy using your application, great work, Label: appreciation


As you can see above, we can use what's called a **for loop** to process a list of data. A for loop is a way to repeat a task for every element in a list. Each element in the list is assigned, one at a time, to a variable (in the example above, `text`), and then we process it inside the loop. Like a function, a for loop is indentation-based, so the code that we want to repeat must be indented inside the loop.
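To make the role of indentation concrete, here is a minimal sketch using a made-up list of fruits (the names `fruits`, `fruit`, and `count` are only for illustration). The indented lines run once per element, while the last line is not indented and runs only once, after the loop has finished.

```python
fruits = ["apple", "banana", "cherry"]
count = 0

for fruit in fruits:
    # Indented: runs once for every element, with `fruit` holding the current one
    print("Current fruit: " + fruit)
    count = count + 1

# Not indented: runs only once, after the loop is finished
print("Processed " + str(count) + " fruits")
```

After the loop ends, `fruit` still holds the last element it was assigned, which is `"cherry"` here.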

# Challenge!

What?! Challenge already?!

Because the `for` loop can be hard to master, let's try several challenges first before we dive deeper into the topic!

## Challenge 1

Above we had a list of texts. Now let's try a for loop with a list of numbers!

Using the list below, create a for loop that prints each number multiplied by 2.

In [4]:
number_list = [
    8,
    9,
    34,
    56,
]

for number in number_list:
    # Change below code to fulfill the requirements
    number = number * 2
    print(number)

16
18
68
112


When you are done with the above challenge, then:
1. Input your student_id and name in the box below
2. Run the code block by pressing the play button.

In [5]:
!pip install rggrader

from rggrader import submit

# @title #### Student Identity
student_id = "REA3X5EN" # @param {type:"string"}
name = "Steven Adi Santoso" # @param {type:"string"}

# Submit Method
assignment_id = "006_working_with_multiple_data"
question_id = "01_for_loop_1"
submit(student_id, name, assignment_id, str(number), question_id)




'Assignment successfully submitted'

## Challenge 2

One function that we can use to process text is `len()`, which returns the length of a string (the number of characters in it). For example:

In [None]:
print(len("Hi there"))
print(len("Thanks for your help"))
print(len("I really appreciate your help on this task"))

8
20
42
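As a small illustrative sketch (the strings below are made up), note that `len()` counts every character in a string, including spaces and punctuation, and returns 0 for an empty string.

```python
print(len("Hi"))        # 2
print(len("Hi!"))       # 3, the exclamation mark counts
print(len("Hi there"))  # 8, seven letters plus one space
print(len(""))          # 0, an empty string has no characters
```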


Now with that in mind, let's try to create a for loop that will print the length of each string in the list below:

In [6]:
text_data = [
    "Hi, how are you?",
    "I'm doing great. How about you?",
    "I'm also great!",
]

# Write your code here

for text in text_data:
    print(len(text))

# Expected printed data:
# 16
# 31
# 15

16
31
15


When you are done with the above challenge, then:

1. Replace the text "my result" below with the text "done"
2. Run the code block by pressing the play button.

In [8]:
# Submit Method
assignment_id = "006_working_with_multiple_data"
question_id = "02_for_loop_2"

result = "done"

submit(student_id, name, assignment_id, result, question_id)

'Assignment successfully submitted'

## Challenge 3

One string method that we can use to process text is `split()`, which splits a string into a list of strings at a given separator. For example:

In [None]:
"Imam,Andi,Budi,Chandra".split(",")

['Imam', 'Andi', 'Budi', 'Chandra']

In the example above, we split the string at every comma and collect the pieces into a list of strings.
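The separator does not have to be a comma. As a minimal sketch with made-up strings, the example below splits on a space and on a dash, and shows that `split()` does not trim any whitespace around the pieces.

```python
# Split a sentence into words using a space as the separator
sentence = "I enjoy using your application"
words = sentence.split(" ")
print(words)       # ['I', 'enjoy', 'using', 'your', 'application']

# Any character can be the separator, for example a dash
date_parts = "2023-11-05".split("-")
print(date_parts)  # ['2023', '11', '05']

# The pieces are kept exactly as they are, including leading spaces
spaced = "Imam, Andi".split(",")
print(spaced)      # ['Imam', ' Andi']
```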

Now split the string below, and then get the length of each word using a for loop!

In [9]:
names = "Fajar,Levina,Putri,Andi,Budi,Chandra"

# Write your code here

# Use a loop variable other than `name`, which holds the student name passed to submit()
for word in names.split(","):
    print(len(word))

# Expected printed data:
# 5
# 6
# 5
# 4
# 4
# 7

5
6
5
4
4
7


When you are done with the above challenge, then:

1. Replace the text "my result" below with the text "done"
2. Run the code block by pressing the play button.

In [11]:
# Submit Method
assignment_id = "006_working_with_multiple_data"
question_id = "03_for_loop_3"

result = "done"

submit(student_id, name, assignment_id, result, question_id)

'Assignment successfully submitted'