# Introduction to Data Types

This notebook introduces the basic data types in Python. Many of the same data types also show up in other programming languages. Data types are important to know because each one has its own properties, which can affect how your program runs.

In this notebook we will focus on the ones you are most likely to use, which are also the basic building blocks for more advanced concepts. These data types are:

1. $\textbf{Numerical}$

    a. Integer
    
    b. Float


2. $\textbf{Boolean (True/False)}$

3. $\textbf{String (Words/Characters)}$

## Data-Type Number 1: Numerical

The cool thing about Python, as opposed to a language like C++, is that you do not need to specify the data type. Python figures out the data type from the value you write and assigns it automatically.

In [None]:
#This value will be interpreted by python as an integer
5

In [None]:
#This value will be interpreted by python as a float
3.14159

In [None]:
#This value will be interpreted by python as an integer
12

In [None]:
#This value will be interpreted by python as a float
12.34

In [None]:
#This value will be interpreted by python as an integer
15

In [None]:
#This value will be interpreted by python as a float
23.432

In [None]:
#This value will be interpreted by python as a float
.5

---
To check if these are correct, there is a Python function called $\textit{type}$ that tells you the data type of whatever you provide it.

Let's try this function on all the numbers above and see what we get.

In [None]:
print('5 is type:', type(5))
print()
print('3.14159 is type:', type(3.14159))
print()
print('12 is type:', type(12))
print()
print('12.34 is type:', type(12.34))
print()
print('15 is type:', type(15))
print()
print('23.432 is type:', type(23.432))
print()
print('0.5 is type:', type(.5))

While the cell above shows different numerical values, what they all have in common is that they are all numbers. Note, however, that Python does not interpret every one of these numbers the same way.

## Quick Question

Can you tell from the cell above $\textbf{why}$ some values are classified as integers and others as floats?

Let's look at the following example for a subtle check:

In [None]:
print('12 is type:', type(12))
print()
print('12.34 is type:', type(12.34))

In [None]:
print('12 is type:', type(12))
print()
print('12. is type:', type(12.))
print()
print('12.34 is type:', type(12.34))

Hopefully you see that the main thing that decides if a value will be an integer or a float is the inclusion of a decimal point. Even 12. is a float, although it has no digits after the point.
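As a quick check of this rule, the sketch below writes the same value with and without a decimal point. The variable names are made up for illustration, and `int()` and `float()` are built-in conversion functions that this notebook has not covered yet.

```python
# The same value written two ways: without and with a decimal point
whole = 12
with_point = 12.0

print(type(whole))          # <class 'int'>
print(type(with_point))     # <class 'float'>

# The two values are equal even though their types differ
same_value = (whole == with_point)
print(same_value)           # True

# float() and int() convert between the two types
converted_to_float = float(whole)     # 12.0
converted_to_int = int(12.9)          # 12, the decimal part is dropped
print(converted_to_float, converted_to_int)
```

Note that `int()` drops the decimal part instead of rounding it.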

## What can we do with Numerical Data?

We can perform mathematical operations on it, the same operations you may have seen in your math class: addition, subtraction, multiplication, division and raising to a power. All of these can be done in Python, and we will go over the syntax below with some examples.

In [None]:
#add

print(4+6)
print(4+6.)
print(4+12.7)

#subtract

print(4-6)
print(4-6.)
print(4-12.7)

#multiply

print(4*6)
print(4*6.)
print(4*12.7)

#divide

print(4/6)
print(4/6.)
print(4/12.7)

#raise to a power, note that this is different from lots of
#other languages which use ^ as the power operator

print(4**6)
print(4**6.)
print(4**12.7)

In [None]:
#Python follows PEMDAS: 

print( (4+2) * 5)
print( (4**2 + 2) * 5)
print( (4**2 / 2) * 5)
print( (4/2)**2 * 5/.5)

In [None]:
#Introducing Shorthand notation

#defining a variable x which is 10
x = 10

print(x)

#shorthand 1
#similar to x = x + 5
x += 5
print()
print(x)

#shorthand 2
#similar to x = x - 5
x -= 5
print()
print(x)

#shorthand 3
#similar to x = x * 5
x *= 5
print()
print(x)

#shorthand 4
#similar to x = x / 5
x /= 5
print()
print(x)

## Mathematical Notation Summary:

Addition Operator: +

Subtraction Operator: -

Multiplication Operator: *

Division Operator: /

Raise to a Power Operator: **

Shorthand:
    1. y += x is the same as y = y+x
    2. y -= x is the same as y = y-x
    3. y *= x is the same as y = y*x
    4. y /= x is the same as y = y/x
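One detail the operators above share is the data type of their result. The short sketch below, with variable names chosen only for illustration, shows that mixing an integer with a float gives a float and that regular division always gives a float.

```python
# Integer with integer stays an integer for +, - and *
int_sum = 4 + 6        # 10

# Mixing an integer with a float gives a float
mixed_sum = 4 + 6.0    # 10.0

# Regular division returns a float even when the answer is a whole number
quotient = 6 / 3       # 2.0

# A shorthand operator can therefore change the type of a variable
y = 10
y /= 5                 # y is now 2.0, a float

print(int_sum, type(int_sum))
print(mixed_sum, type(mixed_sum))
print(quotient, type(quotient))
print(y, type(y))
```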

# Some More Division Operations

Above we covered regular division, which is the one we are all familiar with. Python and other programming languages have other division operations that are useful in certain contexts, and it is worthwhile covering them here. The two are:

1. Remainder division
2. Floor Division

$\textbf{Remainder division:}$ takes two numbers and divides them as normal, but instead of returning the quotient it returns the remainder. The operator for this division is the percent sign %. For example, 11%2 will return 1 because 2 goes into 11 five times and the remainder is 1.

$\textbf{Floor division:}$ is the other side of remainder division: it gives you the whole number of times one number divides into the other. The Python syntax for this division is the double forward slash //. For example, 11//2 will return 5 because 2 goes into 11 five times.

$\textbf{NOTE:}$ Floor division does not round to the nearest whole number; it always rounds down.
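The two operations fit together: multiplying the floor division result by the divisor and adding the remainder gives back the original number. The sketch below checks this for 11 and 2. The variable names are illustrative, and `divmod` is a built-in function not otherwise used in this notebook.

```python
a = 11
b = 2

quotient = a // b     # 5
remainder = a % b     # 1

# Putting the two pieces back together recovers the original number
rebuilt = quotient * b + remainder
print(rebuilt == a)   # True

# divmod returns both results at once as a pair
both = divmod(a, b)
print(both)           # (5, 1)

# Floor division always rounds down, so a negative result
# moves further from zero
negative_floor = -11 // 2
print(negative_floor)  # -6
```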

In [None]:
#remainder Division
print(5%2)

#floor division
print(5//2)

#regular division
print(5/2)

In [None]:
#NOTE: Floor division does not round to the nearest whole number, it just gives you the number of whole times
#one number can go into another number
print(109//10)
print()
#while 109/10 evaluates to 10.9, floor division gives 10 and will NOT round up to 11

# Data-Type Number 2: Boolean

The next data type we will encounter is called Boolean, which is a fancy word for truth values. A truth value can be either $\textbf{True}$ or $\textbf{False}$. Let us run the following cell and see what type True and False have.

In [None]:
print('True is type:', type(True))
print()
print('False is type:', type(False))

As we can see from the cell output above, these $\textbf{True}$ and $\textbf{False}$ values belong to a data type called bool. This data type becomes very important once you want your code to execute only if certain criteria are met.

This is very useful later on when we talk about conditional statements, as Booleans play a crucial role in running some parts of your program instead of others.

## So how do we use them in Python?

There are many ways of using bools, and we will cover two of them in the next cells.

1.  Directly assigning the values of $\textbf{True}$ or $\textbf{False}$ to a variable.
2.  Evaluating expressions, since these expressions evaluate to $\textbf{True}$ or $\textbf{False}$.
    We will also cover how you can directly store the output of an expression in a variable for you to use later.

In [None]:
#1. Direct assignment
MyVariable = True
Statement = False
News = True

In [None]:
#2. Evaluating Expressions
print(4 > 0)
print(4 < 0)
print(4 == 0)
print(4 != 0)

In [None]:
#We can assign expression evaluations to variables
expression1 = 4 > 0
expression2 = 4 < 0
expression3 = 4 == 0
expression4 = 4 != 0

In [None]:
print(expression1)
print(expression2)
print(expression3)
print(expression4)

# Expression Syntax

For us to use Booleans to our advantage we need to know the syntax for evaluating expressions. These are the same expressions that you have seen in your math class but with a slightly different notation.

The expressions people typically use are to

1. check if one value is greater than another value, or greater than or equal to a value. 
2. check if a value is less than another or less than or equal to another value. 
3. check to see if two values are the same. 
4. check if they are not the same.

In the following code cell we show the syntax for each of the expressions above.

In [None]:
#Python Syntax for: 

#greater than (>)

print(5 > 0)

#greater than or equal to (>=)

print(5 >= 0)

#less than (<)

print(5 < 0)

#less than or equal to (<=)

print(5 <= 0)

#equal to (==)

print(5 == 0) 

print(5 == 5)
#NOTE: in Python we need to use two equal signs to compare two values.
#A single equal sign is the assignment operator, so 5 = 0 raises a SyntaxError because you cannot assign to a number

#not equal to (!=)
print(5 != 0)

In [None]:
#The order of the symbols matters and so do the spaces. What happens when you run this cell and the next one?

print(5 > = 0)

In [None]:
print(5 < = 0)

You should have gotten two error messages in the above two cells. The way to fix this error is to remove the space so that the operators are written as >= and <=.

In [None]:
#Careful when using equal (==) with floats, sometimes you will get surprising results
#print this out and see what you get
print(12 == 12.0000000000000001)

# Explanation:

You are probably wondering why 12 == 12.0000000000000001 evaluated to True when we can clearly see that they are two different values and should evaluate to False. The reason is that a float only stores about 15 to 17 significant digits, so 12.0000000000000001 cannot be represented exactly and is stored as 12.0. This kind of floating point error can show up whenever floats are involved. If you ever need to check whether a float is equal to another value, it is recommended to pick a threshold (a tolerance) and check whether the difference between the two values is smaller than that threshold. Below is an example code of this:



In [None]:
tolerance = 1e-3 # this is how close we want the two values to be

int12 = 12
float12 = 12.0000000000000001

difference = abs(float12 - int12) # abs() so that the sign of the difference does not matter

print(difference < tolerance)
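
The tolerance idea is also available ready-made in Python's standard library as `math.isclose`. The sketch below is one possible way to use it, with made-up variable names. It also shows why a manual check should take the absolute value of the difference: without `abs()`, a large negative difference would still pass the test.

```python
import math

# 0.1 + 0.2 is stored as 0.30000000000000004, so == reports False
total = 0.1 + 0.2
exact_match = (total == 0.3)
print(exact_match)       # False

# math.isclose compares within a tolerance instead
close_match = math.isclose(total, 0.3, rel_tol=1e-9)
print(close_match)       # True

# abs() keeps a manual check correct when the difference is negative
tolerance = 1e-3
manual_match = abs(11.9999 - 12) < tolerance
print(manual_match)      # True

# Without abs(), two values that are far apart would wrongly pass
wrong_match = (5 - 12) < tolerance
print(wrong_match)       # True, even though 5 is not close to 12
```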

# Combining Expressions

Boolean expressions can be combined to build more complex conditions. This is done with two operators that we will cover in this section: the $\textbf{and}$ and $\textbf{or}$ operators.

The $\textbf{and}$ operator takes two expressions and only evaluates to $\textbf{True}$ if both expressions are $\textbf{True}$; otherwise it returns $\textbf{False}$. You apply it by writing out the word $\textbf{and}$, and in a Jupyter notebook the word should turn green, indicating that Python recognizes you want to apply the $\textbf{and}$ operator. You can also use the ampersand symbol &. Strictly speaking & is a bitwise operator that happens to give the same answer for True/False values, and it is evaluated before comparisons such as > and <, so always wrap each comparison in parentheses when you use it.

The $\textbf{or}$ operator takes two expressions and evaluates to $\textbf{True}$ if either expression is $\textbf{True}$; it only returns $\textbf{False}$ if both expressions are $\textbf{False}$. You apply it by writing out the word $\textbf{or}$, and in a Jupyter notebook the word should turn green, indicating that Python recognizes you want to apply the $\textbf{or}$ operator. You can also use the vertical line symbol | (on most US keyboards it is located below the delete/backspace button). Like &, the | symbol is a bitwise operator, so the same advice about parentheses applies.

Let us see some examples of how we can combine expressions using the $\textbf{and}$ and $\textbf{or}$ operators. 

In [None]:
#Defining a variable x to be 50
x = 50

#and (&): needs both expressions to be true to evaluate to True
print((x > 10) and (x < 40))

print((x > 10) and (x < 90))

print('----------------')


print((x > 10) & (x < 40))

print((x > 10) & (x < 90))

print('----------------')

#or (|): just needs one of the expressions to be true to evaluate to True even if the other one is False
print((x > 10) or (x < 40))

print((x < 10) or (x < 40))

print('----------------')

print((x > 10) | (x < 40))

print((x < 10) | (x < 40))

In [None]:
#As in the previous section you can store these expressions into variables

#and (&): needs both expressions to be true to evaluate to True
expression5 = (x > 10) and (x < 40)

expression6 = (x > 10) and (x < 90)

print(expression5)
print(expression6)

print('----------------')


expression7 = (x > 10) & (x < 40)

expression8 = (x > 10) & (x < 90)

print(expression7)
print(expression8)

print('----------------')

#or (|): just needs one of the expressions to be true to evaluate to True even if the other one is False
expression9 = (x > 10) or (x < 40)

expression10 = (x < 10) or (x < 40)

print(expression9)
print(expression10)
                     
print('----------------')

expression11 = (x > 10) | (x < 40)

expression12 = (x < 10) | (x < 40)

print(expression11)
print(expression12)

# Note on combining expressions

In the previous section we covered combining two expressions. There may be times when you need to combine 3 or more expressions, and if you are mixing $\textbf{and}$ and $\textbf{or}$ statements, make sure to put parentheses around the expressions so that they are evaluated in the order you intend.
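To see why the parentheses matter, here is a small sketch with a made-up value of x. Python evaluates $\textbf{and}$ before $\textbf{or}$, and it evaluates & before comparisons such as > and <, so leaving the parentheses out can silently change the question being asked.

```python
x = 50

# "and" is evaluated before "or", so these two lines differ
without_parens = x > 10 or x < 40 and x < 0      # x > 10 or (x < 40 and x < 0)
with_parens = (x > 10 or x < 40) and x < 0

print(without_parens)   # True
print(with_parens)      # False

# & is evaluated before the comparisons, so without parentheses
# Python reads the second line as x > (10 & x) < 40
ampersand_with_parens = (x > 10) & (x < 40)
ampersand_without_parens = x > 10 & x < 40

print(ampersand_with_parens)      # False
print(ampersand_without_parens)   # True
```

Neither version raises an error, which is what makes this mistake easy to miss.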

Let us take the following example: you are given a data set of stars and you are asked to select the stars within a range of coordinates. Say RA has to be greater than 20 but less than 50, DEC has to be between 45 and 50, and the star can have either spectral type G or spectral type O.

Let us assume that the RA is stored in a variable RA, the DEC in a variable DEC and the spectral type in a variable spectraltype.

In [None]:
#Note the use of parentheses to isolate each condition. As with PEMDAS,
#everything inside parentheses gets evaluated first.
#This cell needs RA, DEC and spectraltype to be defined before it will run

#First method
((20 < RA) & (RA < 50)) & ((45 < DEC) & (DEC < 50)) & ((spectraltype == 'G') | (spectraltype == 'O'))


In [None]:
#second method

RA_Condition = ((20 < RA) & (RA < 50))
DEC_Condition = ((45 < DEC) & (DEC < 50))
spectype_Condition = ((spectraltype == 'G') | (spectraltype == 'O'))

RA_Condition & DEC_Condition & spectype_Condition
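
For a single star, the same selection can also be written with Python's chained comparisons and the `in` keyword. This is only a sketch: the RA, DEC and spectral type values below are made up for illustration, and the variable `selected` is not part of the original example. It works for single values like these; the & and | form shown above is the one that carries over when the variables later hold whole arrays of values.

```python
# Example values for a single star, made up for illustration
RA = 35.2
DEC = 47.8
spectraltype = 'G'

# 20 < RA < 50 means the same as (20 < RA) and (RA < 50)
RA_Condition = 20 < RA < 50
DEC_Condition = 45 < DEC < 50

# "in" checks whether the value matches any item inside the parentheses
spectype_Condition = spectraltype in ('G', 'O')

selected = RA_Condition and DEC_Condition and spectype_Condition
print(selected)   # True
```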

# Data-Type Number 3: String

The next data type is the string, which is a fancy way of saying letters and words. You can store letters and words in Python as a string by wrapping them in single or double quotation marks. Strings show up all over the place in astronomy research: you will need to read in files with unique names, and you may need them to look up a source in a table. Let us go over defining strings in the next couple of cells.

In [None]:
#storing the string Oscar in the variable Name. Note the use of single quotation marks to let Python know
#that this is a string, if you type it without them you will get an error
Name = 'Oscar'
type(Name)

In [None]:
#storing the string Gene in the variable Name1. Note the use of double quotation marks to let Python know
#that this is a string. Both single and double quotation marks can be used to denote strings in Python
Name1 = "Gene"
type(Name1)

In [None]:
Sentence = "I am learning Python, It is really fun :D!!!"
print(Sentence)

## What if what you want to display needs Quotations?

To do this you can use double quotation marks around the whole string and then single quotation marks for the quote inside it, like the next cell shows.

In [None]:
Quote = "The scientist said, 'global warming is a serious threat'!!"

print(Quote)
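
There are a few other ways to mix quotation marks, sketched below with made-up variable names. The backslash escape and triple quotes are standard Python features that this notebook does not otherwise cover.

```python
# Single quotes on the outside allow double quotes on the inside
quote_double_inside = 'The scientist said, "global warming is a serious threat"!!'
print(quote_double_inside)

# A backslash before a quotation mark tells Python it is part of the text
quote_escaped = 'It\'s a serious threat'
print(quote_escaped)    # It's a serious threat

# Triple quotes allow both kinds of quotation marks in one string
quote_triple = '''She said, "it's a serious threat"'''
print(quote_triple)
```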

# What can you do with Strings?

# 1 Weirdly Enough Some Math :^O!!

There is a whole slew of operations that you can do with strings that involve applying mathematical operators to them. The first one we will cover is concatenating two strings through the addition operator. 

When applied to strings, the addition operator ($\textbf{+}$) should $\textbf{NOT}$ be thought of as the same operator as for numerical data types. Instead you should think of this operator, when applied to strings, as a concatenator, merging the strings together into one string.

You can also apply the multiplication operator to strings, but again this should not be thought of as the same operator being applied to numerical data. When applied to strings, the multiplication operator repeats the same text a given number of times. An example of this is:

'text' * 3 = 'text' + 'text' + 'text' = 'texttexttext'

In [None]:
#Concatenating two strings
'Good' + 'Morning'

In [None]:
#How would you fix this so that there is a space between Good and Morning?

#Solution 1: Introduce a space after Good
print('Good ' + 'Morning')

#Solution 2: Introduce a space before the M in Morning
print('Good' + ' Morning')

#Solution 3: Introduce a string containing a single space between Good and Morning and concatenate along that
print('Good' + ' ' + 'Morning')

In [None]:
#multiplication(but really it is just quick addition)
'Good' * 5

In [None]:
'Morning' * 3
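
Because + and * mean different things for strings and numbers, Python will not let you add a string directly to a number. The sketch below, using made-up variable names, shows one way around this with the built-in `str()` and `int()` conversion functions, which are not covered elsewhere in this notebook.

```python
count = 3

# 'Stars: ' + count would raise a TypeError because count is an integer.
# str() converts the number to text first
label = 'Stars: ' + str(count)
print(label)            # Stars: 3

# int() goes the other way, turning text made of digits into a number
digits = '12'
doubled_text = digits * 2          # '1212', string repetition
doubled_number = int(digits) * 2   # 24, numerical multiplication
print(doubled_text)
print(doubled_number)
```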

## 2. Indexing

Another thing you can do with strings is select a certain part of the string using its index location. Indexing is a way of accessing a certain element within a string, and later within lists and arrays. We need an element's location to access it. Like many programming languages, Python uses zero-based indexing, which means that the 1st element in the string is at index 0 instead of index 1. So let us say we have a string like "Finkelstein": every letter is associated with an index counting up from zero, and the association between index and letter is outlined below:

F = 0, 

i = 1, 

n = 2, 

k = 3, 

e = 4, 

l = 5, 

s = 6, 

t = 7, 

e = 8, 

i = 9, 

n = 10

We can see that the first letter F is index 0, then i is at index 1 and so on. Let us use this knowledge of indexing to get subsections of strings in the next section.

To access an element of a string you use the square brackets and the index location of the element you want to get. Let us say we want to get the t in "Finkelstein" the way we would do that is:

name = "Finkelstein"

name[7]



In [None]:
#How would I get the O in Oscar?

Name = 'Oscar'

#Your Answer Here


In [None]:
#How would I get the r in 'Oscar' ?

#Method 1
print(Name[4])

#Method 2
print(Name[-1])

# Introducing Negative Indexing

$\textbf{Comment:}$ The cell above introduced negative indexing, which is a very convenient way to get items going in reverse order. Say you are writing code that requires you to get the last item of a string: you can use negative indexing to easily get it. Here Python uses the convention that -1 refers to the last item, -2 refers to the second to last item, -3 to the third to last, etc.
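Positive and negative indexes are two labels for the same positions. The sketch below uses the built-in `len()` function, which this notebook has not introduced, to show how the two line up. The variable names are chosen only for illustration.

```python
name = "Finkelstein"

# len() gives the number of characters in the string
length = len(name)          # 11

# The last valid positive index is one less than the length
last_positive = name[length - 1]
last_negative = name[-1]
print(last_positive, last_negative)   # n n

# A negative index -k refers to the same letter as index length - k
same_letter = (name[-4] == name[length - 4])
print(name[-4], same_letter)          # t True
```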

In [None]:
#How about getting the c? Try this out yourself

#Using Method 1
print(Name[])

#Using Method 2
print(Name[])

## 3. Slicing

Slicing is a cool way to get multiple values from a string really easily. The way to do this is by modifying the index notation we learned above. Let us say you have the following code block:

Book_Title = "To Kill a Mockingbird"

Suppose you want to access just a piece of the title, like "Mockingbird". To do that you need to know the starting index for the word Mockingbird, which means figuring out which index the M corresponds to. Then you count up to the end of the word and add 1.

The notation for slicing is shown below, with two index values and a colon in between them, which lets Python know you only want the elements between the starting index and the ending index.

piece = Book_Title[Starting_Index:Ending_Index]

The reason why you add one is that the ending index for slicing a string is not inclusive, so you need to add 1 to make sure you get the whole word you want; otherwise you will be missing one letter. The right indexes can also be found through trial and error.
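As a worked version of the Mockingbird example, the sketch below finds the indexes with the string method `find` and the built-in `len()` instead of counting by hand. Neither is covered elsewhere in this notebook, and the names `start`, `end` and `word` are made up for illustration.

```python
Book_Title = "To Kill a Mockingbird"

# find() returns the index where the word starts
start = Book_Title.find("Mockingbird")    # 10

# The ending index is the index of the last letter plus 1
end = start + len("Mockingbird")          # 21

word = Book_Title[start:end]
print(start, end)    # 10 21
print(word)          # Mockingbird
```

Counting by hand gives the same answer: the M sits at index 10 and the final d at index 20, so the slice is Book_Title[10:21].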

In [None]:
#How would I get the 'Osc' in 'Oscar'? 

# O  S  C  A  R
# 0  1  2  3  4

print(Name[0:3])

print()

#NOTE: By default if you are slicing and you do not specify a starting point python assumes to start
#from the beginning so the code below is the same as above
print(Name[:3])

print()
#You can also use negative indexing when you are slicing but just note that the ending index is not inclusive
#this means we want to stop at the A which is the second to last element
print(Name[:-2])

In [None]:
#How would I get the 'car' in 'Oscar'? 

print(Name[2:5])

print()

#By Default if you do not specify an ending index Python assumes
#you want to go until the end of the string
print(Name[2:])

print()
#You can also use negative indexing when you are slicing
print(Name[-3:])

In [None]:
#Try to get the 'sca' in 'Oscar'? 

print(Name[ : ])

## 4. Skipping

Skipping is a more niche topic, but it is worth mentioning because the same indexing, slicing and skipping concepts will be repeated when we go over lists and arrays.

To skip elements in a string you use the following syntax, where the third value is the step size (a step of 2 keeps every second element, so it skips one element each time):

Text[start_index : end_index : step_size]
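Here is a short sketch of the three-part syntax on a made-up string called `letters`. The last value in the brackets says how far to move between the characters that are kept, and a negative value walks through the string backwards.

```python
letters = "abcdefghij"

# A step of 2 keeps every second character, starting at index 0
every_second = letters[::2]
print(every_second)        # acegi

# Start at index 1, stop before index 8, keep every third character
every_third = letters[1:8:3]
print(every_third)         # beh

# A negative step walks backwards, which reverses the string
reversed_letters = letters[::-1]
print(reversed_letters)    # jihgfedcba
```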

In [None]:
#How would I get 'Ocr' from 'Oscar'?

#Easy way:
print('Easy Way')
print(Name[0] + Name[2] + Name[4])
print()

#Skipping
print('Using Skipping')
print(Name[ : : 2])

In [None]:
#mess around with skipping using the Quote variable, which we defined above. 
#start by changing up the starting and ending index and the 
#number of indices to skip and see what you get
Quote[ : : ]

# Final Remarks

In this notebook we covered 3 data types:

1. Numerical
2. Boolean
3. Strings

We also covered ways to use them in Python. We went over the mathematical operations that you can apply to numerical data and, oddly, to strings as well. We learned how expressions evaluate to Boolean values and how we can combine expressions using $\textbf{and}$ and $\textbf{or}$, and we learned about indexing, slicing and index skipping. All the skills here will be built on in the next section, so take your time going through this and understanding each of the concepts covered.