# Numbers and Strings

In Python, as in any other programming language and in life in general, we _do things_ with _stuff_. Most of this course will focus on the things we do and how to do them. But today we will talk about the stuff.

Almost everything in Python is an object. We will deal with numbers, text, lists, dictionaries... all of those things are different types of Python objects.

The type of an object defines what you can do with it and how it will behave. You can add two numbers and they will behave just as mathematics says addition works:

In [8]:
2 + 2

4

But if you try to sum two strings, the behaviour will be different:

In [9]:
"2" + "2"

'22'
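
The short sketch below puts both cases side by side and shows what happens when the types are mixed. The variable names are made up for this illustration, and the `try`/`except` is only there so that the error does not stop the cell.

```python
# The same operator behaves differently depending on the type of its operands.
number_sum = 2 + 2      # integer addition
text_sum = "2" + "2"    # string concatenation

print(number_sum)  # 4
print(text_sum)    # 22

# Mixing both types is not allowed: Python refuses to guess what we mean.
try:
    mixed = 2 + "2"
except TypeError as error:
    mixed = None
    print("TypeError:", error)
```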

Any object in Python can be assigned to a variable with the `=` operator. Variables are just names for objects: you can think of them as tags that let us identify or call our objects.

In [3]:
result = 4 + 3

At any point, we can retrieve whatever we stored in a variable that was previously defined:

In [4]:
print(result)

7


If we want to find out what type of object a variable holds, we can ask with the function `type()`, like this:

In [5]:
type(result)

int

**Exercise:** Sum `100` plus `3.4` and assign it to the variable `addition`. Print the type of that variable.

In [48]:
addition = 100 + 3.4
print(type(addition))

<class 'float'>

## Booleans

The values `True` and `False` are special in Python. They are called "booleans".

In [19]:
type(True)

bool

In [20]:
type(False)

bool

If you use mathematical operations with booleans, you will see that `True` behaves like `1` and `False` behaves like `0`:


In [12]:
True + True

2

In [13]:
False + False

0
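
As a small illustration of why this matters, the sketch below shows where booleans usually come from (comparisons) and how they can be added up like numbers. The names `is_adult` and `passed_checks` are invented for the example.

```python
# Comparisons produce booleans.
is_adult = 20 >= 18
print(is_adult)                # True

# bool is a special kind of int, so True counts as 1 and False as 0.
print(isinstance(True, int))   # True

passed_checks = True + False + True
print(passed_checks)           # 2
```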

## Integers

Integers are whole numbers (without fractions or decimal points) that can be negative or positive.

In [15]:
type(-8)

int

You can use math operators with integers:

- `+` Addition
- `-` Subtraction
- `*` Multiplication
- `/` Floating-point division
- `//` Floor division (the result is rounded down to a whole number)
- `%` Modulus
- `**` Exponentiation
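
The operators above can be tried out with a quick sketch like this one. The values `7` and `2` are arbitrary and only chosen so that the two kinds of division give visibly different results.

```python
a = 7
b = 2

print(a + b)    # 9
print(a - b)    # 5
print(a * b)    # 14
print(a / b)    # 3.5  (always a float)
print(a // b)   # 3    (rounded down to a whole number)
print(a % b)    # 1    (remainder of the division)
print(a ** b)   # 49

# With negative numbers, // rounds down, not towards zero.
negative_floor = -7 // 2
print(negative_floor)   # -4

# / returns a float even when the result is a whole number.
even_division = 4 / 2
print(even_division)    # 2.0
```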

**Exercise**: assign the number corresponding to the present year to the variable `year` and your age to the variable `age`. Operate with both variables to create a new variable called `birth_year`.

In [55]:
year = 2021
age = 38
birth_year = year-age
print(birth_year)


1983


You can update the value of a variable like this:

In [50]:
age = 30
age = age + 1
print(age)

31


Or you can combine the arithmetic operator and the assignment into one:

In [19]:
age = 30
age += 1
print(age)

31


**Exercise**: The same holds for other operators. Create a variable called `price` with the value `100`. Then update its value in a single line by multiplying it by `2`.

In [56]:
price = 100
price *= 2
print(price)

200


You can convert stuff that's not an integer to an integer with the function `int`:

In [21]:
int(3.14)

3

In [22]:
int("4")

4

If you try to convert something that cannot be interpreted as an integer, you will get an *exception*:

In [23]:
int("Hey hey hey")

ValueError: invalid literal for int() with base 10: 'Hey hey hey'
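
Two details of `int()` are easy to miss, so here is a minimal sketch of them: it cuts off the decimal part instead of rounding, and it rejects strings that contain a decimal point. The `try`/`except` is only used to keep the cell running after the error.

```python
# int() drops the decimal part; it does not round.
print(int(3.99))     # 3
print(int(-3.99))    # -3

# round() is the built-in to use when you want the nearest whole number.
print(round(3.99))   # 4

# A string with a decimal point is not a valid integer literal.
try:
    value = int("3.14")
except ValueError as error:
    value = None
    print("ValueError:", error)
```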

## Floats

Floats are numbers with a decimal point:

In [25]:
3.

3.0

They can include the letter e for an exponent:

In [26]:
3e2

300.0
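
The number after the `e` is the power of ten the value is multiplied by, and it can be negative. A quick sketch, with variable names chosen just for the example:

```python
big = 1.5e3     # 1.5 * 10**3
small = 2e-3    # 2 * 10**-3

print(big)      # 1500.0
print(small)    # 0.002

# A number written with e is always a float, even if its value is whole.
print(type(1e3))   # <class 'float'>
```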

The function `float()` can be used to convert integers to floats:

In [27]:
float(24)

24.0
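
`float()` also accepts strings, as long as they look like a number. The sketch below shows that, plus what happens when an integer and a float meet in the same operation; `total` is just an illustrative name.

```python
print(float("3.14"))   # 3.14
print(float("1e3"))    # 1000.0

# Mixing an int and a float in an operation gives a float.
total = 3 + 0.5
print(total)           # 3.5
print(type(total))     # <class 'float'>
```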

**Exercise**: 

- Using Python, compute how many seconds there are in a year. Store that number in a variable called `year_seconds`.

- Multiply that number by your age and store it in a new variable called `life_seconds`.

- Find out how many seconds there are in the time you'll spend taking this bootcamp. Store it in a variable called `bootcamp`.

- Find out what percentage of your life you will spend taking this bootcamp.


In [61]:
year_seconds = 24 * 60 * 60 * 365
print(year_seconds)

31536000


In [62]:
life_seconds = 38 * year_seconds
print(life_seconds)

1198368000


## Text Strings

You will use text just as much as numbers when programming. Strings are created with either single quotes `'` or double quotes `"`:

In [30]:
print("A string with double quotes")

print('A string with single quotes')

A string with double quotes
A string with single quotes


Having two kinds of quotation marks lets you write strings with quotes inside them:

In [32]:
print("The name of our school is 'WBS CODING SCHOOL', which can be abbreviated to 'WBSCS'")

print('The name of our school is "WBS CODING SCHOOL", which can be abbreviated to "WBSCS"')

The name of our school is 'WBS CODING SCHOOL', which can be abbreviated to 'WBSCS'
The name of our school is "WBS CODING SCHOOL", which can be abbreviated to "WBSCS"


If you need to create a multiple-line string, you can use triple quotes:

In [33]:
poem = """Hold fast to dreams
For if dreams die
Life is a broken-winged bird
That cannot fly.
Hold fast to dreams
For when dreams go
Life is a barren field
Frozen with snow."""

print(poem)

Hold fast to dreams
For if dreams die
Life is a broken-winged bird
That cannot fly.
Hold fast to dreams
For when dreams go
Life is a barren field
Frozen with snow.


You can create a string out of another data type with `str()`:

In [34]:
str_88 = str(88)
print(str_88)

88


In [35]:
type(str_88)

str
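
One place where `str()` comes in handy is when you want to glue a number onto a piece of text with `+`. A minimal sketch, with a made-up variable `n_students`:

```python
n_students = 12

# "We have " + n_students would raise a TypeError, so we convert first.
message = "We have " + str(n_students) + " students."
print(message)   # We have 12 students.

# str() works on other types as well.
summary = str(3.5) + " | " + str(True)
print(summary)   # 3.5 | True
```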

By putting a backslash `\` before a character, we "escape" its usual meaning and give it a special one.

For example, `\n` will add a new line, and `\t` will add a tab:

In [36]:
print("This is the first line\nAnd this is the second one.")

This is the first line
And this is the second one.


In [33]:
print("Before the tab\tAnd after the tab.")

Before the tab	And after the tab.


Similarly, you can escape quotes by using `\`:

In [40]:
fact = "The world's largest rubber duck was 54'2\" by 67'7\" by 105'"
print(fact)

The world's largest rubber duck was 54'2" by 67'7" by 105'


If you need to use a backslash inside your string, you type `\\`: the first backslash escapes the second one:

In [43]:
print("Please contemplate a single, one —and only one— backslash: \\")

Please contemplate a single, one —and only one— backslash: \


By typing `r` before the quotes, we mark a string as a _raw string_, in which backslashes are kept as they are instead of being treated as escape characters:

In [45]:
raw_str = r"This will not create a new line \n and here we will see two backslashes \\"
print(raw_str)

This will not create a new line \n and here we will see two backslashes \\
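
Raw strings are often used for text that is full of backslashes, such as Windows file paths. The path below is purely hypothetical; the sketch only shows that both ways of writing it produce the same string.

```python
normal = "C:\\new_folder\\table.csv"   # every backslash has to be escaped
raw = r"C:\new_folder\table.csv"        # same text, written as a raw string

print(normal)          # C:\new_folder\table.csv
print(normal == raw)   # True

# "\n" is a single newline character; r"\n" is a backslash followed by an n.
print("\n" == r"\n")   # False
```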


You can combine strings using `+` and multiply them with `*`:

In [46]:
print("Data" + "Science")

DataScience


In [47]:
print("Data"*10)

DataDataDataDataDataDataDataDataDataData


Using `[]`, you can grab certain characters from a string by specifying their position, either starting from the first element `[0]` or from the last one `[-1]`:

In [48]:
letters = "abcdefghijklmnopqrstuvwxyz"

letters[0]

'a'

In [52]:
letters[2]

'c'

In [50]:
letters[-1]

'z'

In [51]:
letters[-4]

'w'

You can _slice_ a string using `[start:end]`. Note that the start is inclusive, but the end is not:

In [59]:
letters[1:5]

'bcde'

If you don't specify the start, the slice will start at the beginning of the string. Same goes for the end:

In [54]:
letters[:4]

'abcd'

In [56]:
letters[4:]

'efghijklmnopqrstuvwxyz'
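
Negative positions work inside slices too, and a slice accepts an optional third value, the step, written as `[start:end:step]`. Here is a short sketch using the same `letters` string:

```python
letters = "abcdefghijklmnopqrstuvwxyz"

last_three = letters[-3:]        # the last three characters
without_last = letters[:-3]      # everything but the last three
every_second = letters[::2]      # every second character
backwards = letters[::-1]        # the whole string reversed

print(last_three)     # xyz
print(without_last)   # abcdefghijklmnopqrstuvw
print(every_second)   # acegikmoqsuwy
print(backwards)      # zyxwvutsrqponmlkjihgfedcba
```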

You can count how many characters there are in a string with `len()`:

In [60]:
len(letters)

26

The `split()` method lets you break a string into smaller strings at a given separator:

In [61]:
jobs = "Data Scientist, Data Analyst, Data Engineer, Business Analyst, Marketing Analyst, Analytics Consultant"

In [73]:
jobs

'Data Scientist, Data Analyst, Data Engineer, Business Analyst, Marketing Analyst, Analytics Consultant'

In [62]:
jobs.split(",")

['Data Scientist',
 ' Data Analyst',
 ' Data Engineer',
 ' Business Analyst',
 ' Marketing Analyst',
 ' Analytics Consultant']
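
Notice the leading space in every item but the first. One possible way around it, sketched here on a shorter, made-up string called `short_jobs`, is to split on the comma and the space together:

```python
short_jobs = "Data Scientist, Data Analyst, Data Engineer"

# Splitting on "," leaves a leading space in every item but the first.
with_spaces = short_jobs.split(",")
print(with_spaces)   # ['Data Scientist', ' Data Analyst', ' Data Engineer']

# Splitting on ", " (comma plus space) removes it.
job_list = short_jobs.split(", ")
print(job_list)      # ['Data Scientist', 'Data Analyst', 'Data Engineer']

# Without an argument, split() breaks the string at any whitespace.
words = "Data Science Bootcamp".split()
print(words)         # ['Data', 'Science', 'Bootcamp']
```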

You can substitute one piece of text for another using `replace()`:

In [66]:
jobs.replace("Data", "Duck")

'Duck Scientist, Duck Analyst, Duck Engineer, Business Analyst, Marketing Analyst, Analytics Consultant'

You can specify how many times you want to make that replacement:

In [67]:
jobs.replace("Data", "Duck", 2)

'Duck Scientist, Duck Analyst, Data Engineer, Business Analyst, Marketing Analyst, Analytics Consultant'
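
Keep in mind that `replace()`, like the other string methods, gives you back a new string and does not change the original one. The sketch below uses a made-up variable `title` to show the difference:

```python
title = "Data Analyst"

# replace() returns a new string and leaves the original untouched.
new_title = title.replace("Data", "Duck")
print(title)       # Data Analyst
print(new_title)   # Duck Analyst

# To keep the change, assign the result back to the variable.
title = title.replace("Analyst", "Engineer")
print(title)       # Data Engineer
```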

A very common step in data cleaning is to remove leading and trailing spaces from strings. The method `strip()` does exactly that:

In [71]:
wow = "                     wow                     "
print(wow)

                     wow                     


In [72]:
wow.strip()

'wow'
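
`strip()` has two siblings that only clean one side, and it can remove characters other than whitespace if you pass them in. A small sketch follows; `repr()` is used only so that the remaining spaces are visible between the quotes.

```python
padded = "   wow   "

left_clean = padded.lstrip()    # removes whitespace on the left only
right_clean = padded.rstrip()   # removes whitespace on the right only

print(repr(left_clean))    # 'wow   '
print(repr(right_clean))   # '   wow'

# strip() can also remove other characters if you pass them as an argument.
no_stars = "***wow***".strip("*")
print(no_stars)            # wow

# Tabs and newlines count as whitespace too.
no_whitespace = "\t wow \n".strip()
print(repr(no_whitespace))  # 'wow'
```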

There are many more string methods in Python. Browse through them here: https://www.w3schools.com/python/python_ref_string.asp 

About 99.9% of the time, coding does not mean memorizing functions and knowing by heart how to use them. It means being quick at finding the function you need, working out how to use it by reading the documentation, or simply adapting code examples from the internet to meet your needs.

In the exercises below, you will have to use several Python string methods, in combination with what you've learned here. Use the link from _w3schools_ or Google around to complete them.

**Exercises:**

1. Create a Python string with the following text, and assign it to the variable `data_science`:

Data science is an interdisciplinary field that uses scientific methods, processes, algorithms and systems to extract knowledge and insights from noisy, structured and unstructured data, and apply knowledge and actionable insights from data across a broad range of application domains. 

Data science is related to data mining, machine learning and big data. Data science is a "concept to unify statistics, data analysis, informatics, and their related methods" in order to "understand and analyze actual phenomena" with data. It uses techniques and theories drawn from many fields within the context of mathematics, statistics, computer science, information science, and domain knowledge. However, data science is different from computer science and information science. Turing Award winner Jim Gray imagined data science as a "fourth paradigm" of science (empirical, theoretical, computational, and now data-driven) and asserted that "everything about science is changing because of the impact of information technology" and the data deluge.

In [3]:
data_science = 'Data science is an interdisciplinary field that uses scientific methods, processes, algorithms and systems to extract knowledge and insights from noisy, structured and unstructured data, and apply knowledge and actionable insights from data across a broad range of application domains. Data science is related to data mining, machine learning and big data. Data science is a "concept to unify statistics, data analysis, informatics, and their related methods" in order to "understand and analyze actual phenomena" with data. It uses techniques and theories drawn from many fields within the context of mathematics, statistics, computer science, information science, and domain knowledge. However, data science is different from computer science and information science. Turing Award winner Jim Gray imagined data science as a "fourth paradigm" of science (empirical, theoretical, computational, and now data-driven) and asserted that "everything about science is changing because of the impact of information technology" and the data deluge.'

2. How many characters are there in the string?

In [4]:
print(len(data_science))

1041


3. Convert the whole string to lower case.

In [5]:
print(data_science.lower())

data science is an interdisciplinary field that uses scientific methods, processes, algorithms and systems to extract knowledge and insights from noisy, structured and unstructured data, and apply knowledge and actionable insights from data across a broad range of application domains. data science is related to data mining, machine learning and big data. data science is a "concept to unify statistics, data analysis, informatics, and their related methods" in order to "understand and analyze actual phenomena" with data. it uses techniques and theories drawn from many fields within the context of mathematics, statistics, computer science, information science, and domain knowledge. however, data science is different from computer science and information science. turing award winner jim gray imagined data science as a "fourth paradigm" of science (empirical, theoretical, computational, and now data-driven) and asserted that "everything about science is changing because of the impact of inf

4. How many times does the word "data" appear (in the string you converted to lower case)?

In [6]:
print(data_science.lower().count("data"))

13


5. Separate the string into sentences, breaking it down wherever there is a full stop (`.`).

In [7]:
data_science.split('.')

['Data science is an interdisciplinary field that uses scientific methods, processes, algorithms and systems to extract knowledge and insights from noisy, structured and unstructured data, and apply knowledge and actionable insights from data across a broad range of application domains',
 ' Data science is related to data mining, machine learning and big data',
 ' Data science is a "concept to unify statistics, data analysis, informatics, and their related methods" in order to "understand and analyze actual phenomena" with data',
 ' It uses techniques and theories drawn from many fields within the context of mathematics, statistics, computer science, information science, and domain knowledge',
 ' However, data science is different from computer science and information science',
 ' Turing Award winner Jim Gray imagined data science as a "fourth paradigm" of science (empirical, theoretical, computational, and now data-driven) and asserted that "everything about science is changing becaus

6. Convert all the words of the string to upper case.

In [8]:
print(data_science.upper())

DATA SCIENCE IS AN INTERDISCIPLINARY FIELD THAT USES SCIENTIFIC METHODS, PROCESSES, ALGORITHMS AND SYSTEMS TO EXTRACT KNOWLEDGE AND INSIGHTS FROM NOISY, STRUCTURED AND UNSTRUCTURED DATA, AND APPLY KNOWLEDGE AND ACTIONABLE INSIGHTS FROM DATA ACROSS A BROAD RANGE OF APPLICATION DOMAINS. DATA SCIENCE IS RELATED TO DATA MINING, MACHINE LEARNING AND BIG DATA. DATA SCIENCE IS A "CONCEPT TO UNIFY STATISTICS, DATA ANALYSIS, INFORMATICS, AND THEIR RELATED METHODS" IN ORDER TO "UNDERSTAND AND ANALYZE ACTUAL PHENOMENA" WITH DATA. IT USES TECHNIQUES AND THEORIES DRAWN FROM MANY FIELDS WITHIN THE CONTEXT OF MATHEMATICS, STATISTICS, COMPUTER SCIENCE, INFORMATION SCIENCE, AND DOMAIN KNOWLEDGE. HOWEVER, DATA SCIENCE IS DIFFERENT FROM COMPUTER SCIENCE AND INFORMATION SCIENCE. TURING AWARD WINNER JIM GRAY IMAGINED DATA SCIENCE AS A "FOURTH PARADIGM" OF SCIENCE (EMPIRICAL, THEORETICAL, COMPUTATIONAL, AND NOW DATA-DRIVEN) AND ASSERTED THAT "EVERYTHING ABOUT SCIENCE IS CHANGING BECAUSE OF THE IMPACT OF INF

7. Find the position of the word "Turing" in the string.

In [9]:
print(data_science.find("Turing"))

770


### Formatting with f-strings

Strings can be concatenated with `+`. Sometimes, you need to _interpolate_ values into strings. 

Say you work for a logistics company and have a script that calculates the number of trucks that should be made ready for the next day. You want this number to be included in a sentence of an email, like this: `"For tomorrow, we will need x trucks."`, but with x replaced by your value.

This task is known as formatting strings. It can be confusing to learn this by just browsing the internet, because there are several ways to format strings in Python: the old one (using `%`), the new one (using `{}` and `format()`) and the newest one (using f-strings).

We will only show you how to use f-strings, but be ready to stumble onto the other ones at any time!

In [79]:
n_trucks = 73

email = f"For tomorrow, we will need {n_trucks} trucks."

print(email)

For tomorrow, we will need 73 trucks.


The curly brackets also accept expressions, like these:

In [82]:
place = "main garage"

email = f"For tomorrow, we will need {n_trucks+10} trucks in the {place.title()}."

print(email)

For tomorrow, we will need 83 trucks in the Main Garage.
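
Inside the curly brackets you can also add a format specifier after a colon to control how a number is displayed. The following is a brief sketch with made-up values (`cost` and `share` are not part of the trucks example):

```python
cost = 1234.5678
share = 0.256

two_decimals = f"Total: {cost:.2f} EUR"     # two decimal places
thousands = f"Total: {cost:,.0f} EUR"       # no decimals, thousands separator
percentage = f"Share: {share:.1%}"          # shown as a percentage

print(two_decimals)   # Total: 1234.57 EUR
print(thousands)      # Total: 1,235 EUR
print(percentage)     # Share: 25.6%
```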


**Exercise**: Convert the `data_science` string from the previous exercise into an f-string, replacing the name "Jim Gray" by the variable `jim` that we define here:

In [83]:
jim = "James Nicholas Gray"

In [11]:
print(data_science.replace("Jim Gray", jim))

Data science is an interdisciplinary field that uses scientific methods, processes, algorithms and systems to extract knowledge and insights from noisy, structured and unstructured data, and apply knowledge and actionable insights from data across a broad range of application domains. Data science is related to data mining, machine learning and big data. Data science is a "concept to unify statistics, data analysis, informatics, and their related methods" in order to "understand and analyze actual phenomena" with data. It uses techniques and theories drawn from many fields within the context of mathematics, statistics, computer science, information science, and domain knowledge. However, data science is different from computer science and information science. Turing Award winner James Nicholas Gray imagined data science as a "fourth paradigm" of science (empirical, theoretical, computational, and now data-driven) and asserted that "everything about science is changing because of the im