# Analyzing a Multiline String and Generating the Unique Word Count

This notebook walks through the following steps:

1. Assign a multiline text to a Python variable.
2. Determine its type and length.
3. Remove newlines and punctuation.
4. Split the text into words.
5. Extract unique words (case-sensitive).
6. Count word occurrences (case-insensitive).
7. Display the 25 most frequent words.
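
The steps above can be sketched as a single function. This is a minimal sketch rather than the notebook's own code: the name `top_words`, its parameter `n`, and the sample string are assumptions made for illustration, and `collections.Counter` stands in for the manual counting used later in the notebook.

```python
import string
from collections import Counter


def top_words(text, n=25):
    """Return the n most frequent words in text, ignoring case and punctuation."""
    # Step 3: flatten line breaks and strip ASCII punctuation.
    flat = text.replace("\r", " ").replace("\n", " ")
    flat = flat.translate(str.maketrans("", "", string.punctuation))
    # Steps 4-6: split on whitespace, lowercase each word, and count.
    counts = Counter(word.lower() for word in flat.split())
    # Step 7: most_common returns (word, count) pairs, highest count first.
    return counts.most_common(n)


sample = "The cat sat.\nThe dog sat, too!"  # hypothetical sample text
print(top_words(sample, n=2))
```

Calling `top_words(multiline_text)` on the text assigned in section 1 would follow the same path as the cells below, condensed into one call.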

## 1. Assign the Multiline Text

In [12]:
multiline_text = """Human evolution is the lengthy process of change by which people originated from apelike ancestors. Scientific evidence shows that the physical and behavioral traits shared by all people originated from apelike ancestors and evolved over a period of approximately six million years.
One of the earliest defining human traits, bipedalism -- the ability to walk on two legs -- evolved over 4 million years ago. Other important human characteristics -- such as a large and complex brain, the ability to make and use tools, and the capacity for language -- developed more recently. Many advanced traits -- including complex symbolic expression, art, and elaborate cultural diversity -- emerged mainly during the past 100,000 years.
Humans are primates. Physical and genetic similarities show that the modern human species, ÿHomo sapiens, has a very close relationship to another group of primate species, the apes. Humans and the great apes (large apes) of Africa -- chimpanzees (including bonobos, or so-called ?pygmy chimpanzees?) and gorillas -- share a common ancestor that lived between 8 and 6 million years ago. Humans first evolved in Africa, and much of human evolution occurred on that continent. The fossils of early humans who lived between 6 and 2 million years ago come entirely from Africa.
Most scientists currently recognize some 15 to 20 different species of early humans. Scientists do not all agree, however, about how these species are related or which ones simply died out. Many early human species -- certainly the majority of them ? left no living descendants. Scientists also debate over how to identify and classify particular species of early humans, and about what factors influenced the evolution and extinction of each species.
Early humans first migrated out of Africa into Asia probably between 2 million and 1.8 million years ago. They entered Europe somewhat later, between 1.5 million and 1 million years. Species of modern humans populated many parts of the world much later. For instance, people first came to Australia probably within the past 60,000 years and to the Americas within the past 30,000 years or so. The beginnings of agriculture and the rise of the first civilizations occurred within the past 12,000 years."""


## 2. Type and Length of the Text

In [13]:
print('Type:', type(multiline_text))
print('Length:', len(multiline_text))

Type: <class 'str'>
Length: 2255
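
Note that `len` on a `str` counts characters (Unicode code points), not bytes or lines. The sketch below illustrates the difference on a short hypothetical sample that reuses the non-ASCII `ÿ` character present in the source text; the sample string itself is an assumption, not part of the notebook.

```python
sample = "ÿHomo sapiens\nHumans are primates."  # hypothetical two-line sample

n_chars = len(sample)                  # characters (code points), newline included
n_bytes = len(sample.encode("utf-8"))  # bytes once encoded as UTF-8
n_lines = len(sample.splitlines())     # number of lines

print(n_chars, n_bytes, n_lines)
```

The byte count exceeds the character count here because `ÿ` takes two bytes in UTF-8.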


## 3. Remove Newlines and Punctuation

In [14]:

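import string

# Replace line breaks with spaces so the text becomes a single line.
cleaned_text = multiline_text.replace('\n', ' ').replace('\r', ' ')

# Delete every ASCII punctuation character listed in string.punctuation.
# This also removes characters inside tokens, so "1.8" becomes "18"
# and "so-called" becomes "socalled".
for symbol in string.punctuation:
    cleaned_text = cleaned_text.replace(symbol, '')

print(cleaned_text)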


Human evolution is the lengthy process of change by which people originated from apelike ancestors Scientific evidence shows that the physical and behavioral traits shared by all people originated from apelike ancestors and evolved over a period of approximately six million years One of the earliest defining human traits bipedalism  the ability to walk on two legs  evolved over 4 million years ago Other important human characteristics  such as a large and complex brain the ability to make and use tools and the capacity for language  developed more recently Many advanced traits  including complex symbolic expression art and elaborate cultural diversity  emerged mainly during the past 100000 years Humans are primates Physical and genetic similarities show that the modern human species ÿHomo sapiens has a very close relationship to another group of primate species the apes Humans and the great apes large apes of Africa  chimpanzees including bonobos or socalled pygmy chimpanzees and goril
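
Deleting every punctuation character also fuses tokens such as `1.8`, `100,000`, and `so-called`. One possible alternative, shown here only as a sketch, is to extract tokens with a regular expression that keeps a single `.`, `,`, `'`, or `-` when it sits between letters or digits. The pattern `TOKEN` and the sample sentence are assumptions for illustration; the pattern is ASCII-only, so a character such as `ÿ` would be dropped.

```python
import re

# Hypothetical pattern: runs of ASCII letters/digits, optionally joined by one
# '.', ',', "'" or '-' when more letters/digits follow immediately.
TOKEN = re.compile(r"[A-Za-z0-9]+(?:[.,'-][A-Za-z0-9]+)*")

sample = "so-called apes -- lived 1.8 million years ago, or 100,000 years."
tokens = TOKEN.findall(sample)
print(tokens)
```

With this approach the word counts in later sections would differ from the notebook's output, because `1.8` and `so-called` stay intact instead of becoming `18` and `socalled`.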

## 4. Split into Words

In [16]:
words = cleaned_text.split()
print('Total words:', len(words))

Total words: 352


## 5. Unique Words

In [17]:
unique_words = list(set(words))
print('Unique words count:', len(unique_words))

Unique words count: 193
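
The count above is case-sensitive, so `Humans` and `humans` are treated as two different words. The sketch below, on a hypothetical word list, shows how lowercasing first changes the result, and how `dict.fromkeys` could be used when first-seen order matters (a `set` does not preserve order).

```python
words = ["Humans", "are", "primates", "humans", "and", "Apes", "apes"]  # hypothetical sample

case_sensitive = set(words)
case_insensitive = set(w.lower() for w in words)
print(len(case_sensitive), len(case_insensitive))

# dict.fromkeys keeps the order in which keys were first seen.
ordered_unique = list(dict.fromkeys(w.lower() for w in words))
print(ordered_unique)
```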


## 6. Count Word Occurrences (Case-Insensitive)

In [18]:
# Lowercase every word so that "Humans" and "humans" are counted together.
words_lower = [w.lower() for w in words]
unique_words_lower = list(set(words_lower))

# Map each distinct lowercase word to its number of occurrences.
word_counts = {word: words_lower.count(word) for word in unique_words_lower}

print('Word counts (case-insensitive):')
for word, cnt in word_counts.items():
    print(f"{word}: {cnt}")

Word counts (case-insensitive):
such: 1
6: 2
period: 1
important: 1
other: 1
asia: 1
agriculture: 1
8: 1
sapiens: 1
over: 3
defining: 1
legs: 1
how: 2
great: 1
art: 1
came: 1
died: 1
walk: 1
for: 2
primates: 1
first: 4
related: 1
18: 1
ones: 1
traits: 3
1: 1
close: 1
apes: 3
early: 5
60000: 1
symbolic: 1
out: 2
of: 16
later: 2
from: 3
modern: 2
these: 1
so: 1
species: 8
majority: 1
behavioral: 1
socalled: 1
primate: 1
ago: 4
lived: 2
shared: 1
scientific: 1
on: 2
elaborate: 1
descendants: 1
factors: 1
entered: 1
australia: 1
by: 2
most: 1
also: 1
europe: 1
them: 1
parts: 1
within: 3
20: 1
including: 2
capacity: 1
migrated: 1
identify: 1
is: 1
they: 1
no: 1
classify: 1
years: 10
probably: 2
come: 1
influenced: 1
mainly: 1
debate: 1
group: 1
instance: 1
make: 1
show: 1
recognize: 1
world: 1
populated: 1
evolution: 3
americas: 1
4: 1
left: 1
developed: 1
evidence: 1
diversity: 1
as: 1
entirely: 1
to: 7
in: 1
humans: 8
much: 2
chimpanzees: 2
recently: 1
earliest: 1
fossils: 1
particular: 1
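
Calling `list.count` once per unique word rescans the whole list each time. As an alternative sketch (not the notebook's method), `collections.Counter` from the standard library builds the same mapping in a single pass. The word list below is a hypothetical sample.

```python
from collections import Counter

words = ["Early", "humans", "and", "early", "apes", "and", "Humans", "early"]  # hypothetical sample
words_lower = [w.lower() for w in words]

# Single pass over the list.
fast_counts = Counter(words_lower)

# Same result as the dict comprehension that calls list.count per word.
slow_counts = {w: words_lower.count(w) for w in set(words_lower)}
assert dict(fast_counts) == slow_counts

print(fast_counts["early"])    # count for a word that occurs
print(fast_counts["missing"])  # a Counter returns 0 for absent keys
```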

## 7. Top 25 Most Frequent Words

In [19]:
# Sort (word, count) pairs by count, highest first.
sorted_word_counts = sorted(word_counts.items(), key=lambda x: x[1], reverse=True)

# Keep the first 25 pairs.
top_25 = sorted_word_counts[:25]

for word, count in top_25:
    print(f"{word}: {count}")

the: 21
and: 19
of: 16
years: 10
species: 8
humans: 8
million: 8
to: 7
human: 6
early: 5
first: 4
ago: 4
africa: 4
a: 4
between: 4
past: 4
that: 4
over: 3
traits: 3
apes: 3
from: 3
within: 3
evolution: 3
or: 3
scientists: 3
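
Because `word_counts` is built from a `set`, the order of words with equal counts can change between runs, and a cutoff at 25 may then include different tied words. A minimal sketch of one way to make the ranking reproducible is to break ties alphabetically; the small dictionary below reuses a few counts from the output above, and the name `ranked` is an assumption.

```python
word_counts = {"years": 10, "species": 8, "humans": 8, "million": 8, "to": 7}

# Sort by count descending, then by word ascending to break ties.
ranked = sorted(word_counts.items(), key=lambda item: (-item[1], item[0]))

for word, count in ranked:
    print(f"{word}: {count}")
```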
