# Data Types

In Python, everything is considered to be some *type* of object.

```{margin}
An **object** is just Python's way of saying "a thing". Since everything is data in a computer's eyes, we call the types of these objects 'data types'.
```

We can easily figure out what type of object something is by using the `type` function.

In [None]:
type(1)

In [None]:
type('hello')

In [None]:
type(True)

Even functions, like `sum`, are a type of object! We'll talk about this at the end.

In [None]:
type(sum)

## Primitives

Some objects are really basic; you could even call them "primitive".

The four primitive data types are:
- Integers
- Floats
- Strings
- Booleans

and they correspond to the types of data we see most often: numbers, text, and true/false.

These objects essentially have intrinsic value -- they are raw data.

### Numbers - Ints and Floats

Turns out, there's actually a distinction between two main types of numbers: **integers** and **floats**.

Just like in math class, **integers** must be whole numbers without any decimals or fractions. These can be big, small, positive, negative, or zero.  Integers have the type `int`.

In [None]:
type(4)

In [None]:
type(0)

In [None]:
type(-10968298760848092360872)

Real numbers are called **floats**, and are basically any number with a decimal point. Floats have the type `float`.

```{margin}
The term 'float' comes from "floating point number", since they're represented by the computer as a value and the position of the decimal point -- so the decimal point can 'float' to any position in the value.
```

When floats get really big or really small they might be printed in scientific notation. You can write floats in scientific notation, too.

In [None]:
type(4.0)

In [None]:
type(-5987.63644)

In [None]:
-12345678909876654321.0

In [None]:
type(1e-10)

In [None]:
type(1 / 1)

Did you catch the strange thing that happened in the last code cell?

#### When `int` and `float` mix

Recall that expressions within parentheses will be evaluated first, so the `type` function is actually being called on the result of the expression `1 / 1`.  But, wouldn't we expect that any number divided by itself is just 1... an integer?

Python will always return a float if an expression involves a float, or if integers are being divided with the `/` operator.

In [None]:
1 + 2.0

In [None]:
4 / 2
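
As a small sketch of this rule, the cell below checks the type of a few results. It also uses floor division `//`, an operator this page doesn't otherwise cover, to show that not every kind of division produces a float. The variable names are only for illustration.

```python
# True division (/) always produces a float, even when the result is whole.
true_div = 4 / 2
# Floor division (//) between two ints keeps the result an int.
floor_div = 4 // 2
# Mixing an int with a float produces a float.
mixed = 1 + 2.0

print(type(true_div), type(floor_div), type(mixed))
```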

If we want to convert the result into specifically an int or a float, we can use the `int` or `float` functions.

In [None]:
int(1 + 2.0)

In [None]:
float(1 + 2)

But be aware that converting a float to an int will essentially just drop anything after the decimal point -- it will not round.

In [None]:
int(1 + 2.5)
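
If rounding is what you actually want, the built-in `round` function is one option. The sketch below compares the two side by side, using made-up values and variable names.

```python
# int() drops the fractional part, which moves the value toward zero.
truncated_pos = int(3.7)
truncated_neg = int(-3.7)
# round() returns the nearest integer instead.
rounded_pos = round(3.7)
rounded_neg = round(-3.7)

print(truncated_pos, truncated_neg)  # 3 -3
print(rounded_pos, rounded_neg)      # 4 -4
```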

#### A note on precision

In [None]:
12345678901234567890.1234567890123456789

Integers can be any length (as long as you don't run out of memory on your computer!), and will remain precise to the nearest integer.

Floats, however, only remain precise up to roughly 15-17 significant digits.

In [None]:
# This integer can be as long as we want
# and it'll still be precise to the nearest integer.
123456789012345678901234567890

In [None]:
# This float will cut off after the 16th digit
1.23456789012345678901234567890

In [None]:
# Same with this one
1234567890.12345678901234567890

In [None]:
# And this one
12345678901234567890.1234567890
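
One way to see this limit directly is to compare two large neighboring integers before and after converting them to floats. This is only a sketch: the value `2**53` is chosen because it is where typical 64-bit floats stop being able to represent every integer, and `sys.float_info.dig` reports how many decimal digits a float is guaranteed to preserve on your machine.

```python
import sys

# Number of decimal digits a float is guaranteed to preserve on this platform.
guaranteed_digits = sys.float_info.dig

# Integers stay exact, so these two neighbors are different values.
big_int = 2**53
ints_differ = big_int != big_int + 1

# Converted to floats, the two neighbors become indistinguishable.
floats_match = float(big_int) == float(big_int + 1)

print(guaranteed_digits, ints_differ, floats_match)
```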

Because floats lack some precision, small arithmetic errors called "floating point errors" can result from float operations.

In [None]:
3.0 * 1.2

This also means that sometimes something that *should* be zero doesn't seem to be zero, but instead appears to be a super small number.

In [None]:
(3.0 * 1.2) - 3.6
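
Because of this, checking floats for exact equality can give surprising answers. A common workaround, sketched below, is to ask whether two floats are *close enough* using `math.isclose` from Python's standard library. The variable names are illustrative.

```python
import math

product = 3.0 * 1.2

# Exact equality fails here because of the tiny floating point error.
exactly_equal = product == 3.6

# math.isclose checks whether two floats agree within a small tolerance.
close_enough = math.isclose(product, 3.6)

print(exactly_equal, close_enough)  # False True
```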

### Text - Strings

**Strings** are what most programming languages call text. In Python, once text is surrounded by single quotes `' '` or double quotes `" "` then it's treated as a string. Strings have the type `str`.

In [None]:
type('Word')

In [None]:
type("More than one word, and punc-tu-a-tion!")

What happens if you don't put anything inside of the quotes?

In [None]:
type('')

This is fine! We call it an empty string.

#### Single vs. double quotes

What if we wanted to turn the following into a string?
> I'm having a great day

Notice that the text itself includes a single quote (an apostrophe), so trying to wrap it in single quotes won't work. Python thinks that the string ends the moment a second single quote is found.

In [None]:
'He asked if I'd like to use an apostrophe'

The same problem arises if we try to use double quotes around text that includes double quotes.

In [None]:
"He said a "double-quote" was fine to use"

There are two ways we can fix these issues.

1. We can use whichever quotation mark *doesn't* show up in our text. People often wrap strings with double quotes since it's common for apostrophes to show up in text.

In [None]:
"He asked if I'd like to use an apostrophe"

In [None]:
'He said a "double-quote" was fine to use'

2. We can 'escape' the character by prefixing it with a backslash `\`.  This will tell Python to treat the character differently than it normally would -- in this case, it tells Python not to end the string.

    This is most helpful when both single quotes *and* double quotes appear in the string!

In [None]:
'He said, "escaping isn\'t so bad," and I believe him!'

```{note}
Escaped characters are also used to input special characters like a new line `\n` or a tab `\t`.
```
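
To see what these special characters do, it helps to `print` the string rather than just display it. The example text below is made up for illustration.

```python
# \t inserts a tab and \n starts a new line when the string is printed.
message = "Name:\tWaldo\nStatus:\tfound"
print(message)

# Each escape sequence counts as a single character in the string.
newline_length = len("\n")
print(newline_length)  # 1
```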

#### Combining strings

Interestingly, we can add strings together using the addition operator `+`.  This basically glues the strings together, and is commonly called **concatenation**.

In [None]:
"one fish" + "two fish" + "three fish"

````{hiddenanswer}
---
question: |
    Given the following variables, write an expression that concatenates the two strings and adds a space in between. The output should be `'red fish blue fish'`

    ```
    string1 = "red fish"
    string2 = "blue fish"
    ```
answer: |
    ```
    string1 + ' ' + string2
    ```
````

#### String methods

Finally, once you create a string, that string possesses an extra set of functions that are unique to strings.

```{margin}
When a specific type of object has its own set of functions, we call those functions 'methods'.  You'll see in the next chapter that more complex data types usually have lots of methods!
```

Methods can be called directly on a string, or the variable name of a string. The following methods return the string they're called on, but with different capitalization -- they can be very useful for when you're cleaning data.

In [None]:
my_string = "JuSt A sTrInG"

In [None]:
my_string.lower()

In [None]:
my_string.upper()

In [None]:
my_string.title()

Notice that a method is accessed by placing a dot after the string, and then calling the function name.

```{margin}
This is commonly called **dot notation**. It indicates that whatever comes after the dot *belongs* to the object before the dot.
```

The `replace` method is extremely powerful, since it allows us to find and replace sections of a string. The previous string methods we looked at took no arguments, but the `replace` method takes two arguments: *the text to find*, and *the text to replace it with*.

In [None]:
'found you'.replace('you', 'Waldo')

Remember the empty string `''`? It's used a lot with `replace` in order to get rid of parts of text entirely! Notice that the text must match *exactly*, and is case sensitive!

In [None]:
'where\'s Waldo'.replace('w', '')

Since the string methods we've looked at return more strings, we can even call more string methods on the result!

In [None]:
s = 'started with words'
t = s.replace('started', 'ended')
u = t.replace('words', 'a sentence')
v = u.capitalize()
w = v + '.'
w
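
The intermediate variables aren't strictly needed. As a sketch of the same steps, the calls can be chained one after another in a single expression; the name `chained` is just for illustration.

```python
s = 'started with words'

# Each method returns a new string, so the next call can follow directly.
chained = s.replace('started', 'ended').replace('words', 'a sentence').capitalize() + '.'

print(chained)  # Ended with a sentence.
print(s)        # the original string is unchanged
```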

### True/False - Booleans

A **Boolean** (named after [George Boole](https://en.wikipedia.org/wiki/Boolean_algebra)) is a logical data type, indicating whether something is True or False. It has the type `bool`.

In [None]:
type(True)

#### Comparisons

Boolean values result when we use comparison operators to compare the value of two expressions.

The standard set of comparison operators carries over from math:
- Less than `<`
- Less or equal `<=`
- Greater than `>`
- Greater or equal `>=`
- Equal `==`
- Not equal `!=`

Notice that the equal comparison operator distinguishes itself from the assignment operator by using *two* equal signs.

In [None]:
1 == 0

Any expressions to the left or right of the comparison operator will be evaluated before the comparison is carried out.

In [None]:
(3 * 4) / 6 < 1 + 2 + 3 + 4

And if there are multiple comparison operators in an expression, then each comparison must evaluate to True in order for the entire expression to be True.

In [None]:
1 < 0 + 2 < 3

In [None]:
1 != 3 <= 2

Since `3 <= 2` is False, the above expression evaluates to False.
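
One way to read a chained comparison is as several separate comparisons joined together. The sketch below spells this out with Python's `and` operator, which isn't otherwise covered on this page; it is True only when both sides are True.

```python
# A chained comparison...
chained = 1 != 3 <= 2
# ...behaves like the individual comparisons joined with `and`.
spelled_out = (1 != 3) and (3 <= 2)

# The earlier example works the same way.
chained_true = 1 < 0 + 2 < 3
spelled_out_true = (1 < 0 + 2) and (0 + 2 < 3)

print(chained, spelled_out)            # False False
print(chained_true, spelled_out_true)  # True True
```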

We can use comparison operators on all sorts of things! For example, we can use the `==` and `!=` to check if *any* objects are equal in value.

In [None]:
'Ronaldo' == 'Waldo'

In [None]:
True != False

In [None]:
sum == sum

And many objects support greater than/less than comparison too. For instance, a string is less than or greater than another string based on alphabetical order.

```{margin}
Technically, string comparisons compare using **lexicographical** order, which just means that text including numbers and symbols is also ordered.
```
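
Under the hood, Python compares strings one character at a time using each character's numeric code, which the built-in `ord` function reveals. The sketch below, with made-up example strings, shows two consequences that differ from a paper dictionary: uppercase letters sort before lowercase ones, and digits are compared as characters rather than as numbers.

```python
# ord() gives the numeric code that Python uses to order characters.
code_upper_z = ord("Z")
code_lower_a = ord("a")
print(code_upper_z, code_lower_a)  # 90 97

# Every uppercase ASCII letter sorts before every lowercase one.
upper_before_lower = "Zebra" < "apple"

# Digits inside strings are compared character by character, not as numbers.
ten_before_nine = "10" < "9"

print(upper_before_lower, ten_before_nine)  # True True
```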

In [None]:
"Avocado" < "Banana" < "Cantaloupe"

In [None]:
"1. Learn about Python" < "2. Learn about data science" < "3. Profit"

Notice that if you look at a dictionary, words like "Fire" will show up before "Fireplace" -- the same holds true with string comparisons in Python.

In [None]:
"Fire" < "Fireplace" < "Fireplaces"

## A new type of error -- TypeErrors

We saw that we can use the addition operator to add two strings. But what happens if we try to add a number and a string?

In [None]:
1 + '2'

We've successfully stumbled on another very common error, the `TypeError`. This happens when we try to use an operator on a type of object that doesn't support it!

Another type error (with a different explanation) arises if we try switching the order of the string and number.

In [None]:
'2' + 1

Or if we try to use an operator that strings don't understand, like subtraction.

In [None]:
'2' - '2'

Sometimes we'll get a type error when using a function call on an object of the wrong type, too. For example, if we try using a math function on a string.

In [None]:
import math
math.log('2')

```{tip}
Notice that the explanation that follows the `TypeError` at the very bottom will almost always explain the data types of your operands -- which is helpful to figure out exactly what variables or values lead to the code breaking!
```
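
A common fix is to convert one of the values so that both operands have the same type. The sketch below assumes the string contains a valid whole number; `int` will raise a different error if it doesn't. The `str` function, which converts a value to a string, is not covered elsewhere on this page.

```python
# Convert the string to a number to do arithmetic.
total = 1 + int('2')

# Or convert the number to a string to concatenate.
joined = str(1) + '2'

print(total)   # 3
print(joined)  # 12
```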

## Other data types

Once we move past primitives, our objects start getting more complex!

Remember, everything has a type. We saw at the beginning of this page that even functions are a type of object.

Besides functions, the most common types of objects we'll observe will act mostly like containers for raw data. Examples of these container-like data types include lists, arrays, and tables -- all of which are extremely useful, and which you'll get to know soon!

---
## Summary

Everything in Python has a type -- these are called **data types**.

We can find the type of an object by calling the `type` function on an object or expression.

There are four **primitive** types that represent raw data:
- **Integers** `int` are whole numbers
- **Floats** `float` are numbers with decimals
- **Strings** `str` are text
- **Booleans** `bool` are True/False

Division with `/`, or any expression that involves a float, will produce a float.

Multiple strings can be glued together using `+`.

Strings own a handful of **methods** -- functions that belong solely to the data type of strings.

Methods are called using **dot notation**, by placing a dot after a string or variable name of a string, then calling the function: `my_string.function_name(arguments, ...)`

Some string methods allow you to create new strings that change capitalization or find and replace snippets of text.

Lots of objects can be compared using **comparison operators**, `<` `<=` `>` `>=` `==` `!=`, which will return a boolean value.

Trying to perform an operation on data types that don't support that operation will often result in a **TypeError**.

Most other data types that aren't primitive are either functions or act like containers for primitive types.