# Primitive data types


## A: Strings

Like any data type, we can save a string as a variable through the following syntax:

var_name = "our string"

Strings must be surrounded by one of the following quote patterns to be recognized by Python as a string.

- 'single quotes'
- "double quotes"
- '''triple single quotes'''
- """triple double quotes"""
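
Having several quote styles is handy when the text itself contains quote characters. Here is a minimal sketch; the variable names are illustrative and not part of the original notebook:

```python
# Double quotes on the outside let us use an apostrophe inside the string.
with_apostrophe = "it's a nice day"

# Single quotes on the outside let us use double quotes inside the string.
with_double_quotes = 'she said "hello"'

# Triple quotes allow a string to span more than one line.
multi_line = """first line
second line"""

print(with_apostrophe)
print(with_double_quotes)
print(multi_line)
```

Which style to use is mostly a matter of taste, as long as the opening and closing quotes match.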

In [1]:
# Let's try it out: (this is a comment! It starts with a #, and these lines are not run.)

our_first_string = "hello world!"

By default, notebook editors (like Google Colab or Jupyter Notebooks) will display the value of the last line run in a cell, so we can view the contents of our_first_string below:

In [2]:
our_first_string

'hello world!'

In [3]:
print(our_first_string)

hello world!


Great, we can see that we successfully saved this as a variable. We can easily check the variable's type in Python using the type() function, like this:

In [4]:
type(our_first_string)

str

What if we want to save and then print out multiple strings? We can't do this with the notebook's "print last line" capability alone, as you'll see below:

In [5]:
our_second_string = "another string"
our_third_string = "even more strings!"

our_second_string
our_third_string

'even more strings!'

Since the notebook only displays the value of the last line, we can use Python's print() function to write out anything we want.

This will be extremely useful for debugging more complex data processing and analysis later on.

In [6]:
print(our_second_string)
print(our_third_string)
print(2+2)

another string
even more strings!
4


Important note: be aware of the difference between a VARIABLE (such as our_second_string) and just printing out a string of that name, like "our_second_string". See below:

In [7]:
print(our_second_string)
print("our_second_string")

another string
our_second_string


### String operations and methods

We can **concatenate** strings together using the "+" operator. This is the same operator we'll use to add numbers together, but it's extremely useful for "filling in the blanks" of sentences based on data. See below!

In [8]:
uva_state = "Virginia"
uva_city = "Charlottesville"
uva_funding = "public"
uva_size = "sixteen thousand ish"

In [9]:
uva_sentence = "UVA is a " + uva_funding + " school in " + uva_city + ", " + uva_state + " that has around " + uva_size + " undergraduate students. "

In [10]:
print(uva_sentence)

UVA is a public school in Charlottesville, Virginia that has around sixteen thousand ish undergraduate students. 


We can also **multiply / duplicate strings** by multiplying them by an integer (we'll learn more about integers in a bit!)

In [11]:
print("threetimes"*3)

three_times = "threetimes "
print(three_times*3)

threetimesthreetimesthreetimes
threetimes threetimes threetimes 
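
One everyday use of string multiplication is building a divider line for printed output. This is only a small sketch, and the name `divider` is an illustrative choice:

```python
# Repeat a single dash 20 times to build a divider line.
divider = "-" * 20

print(divider)
print("Results")
print(divider)
```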


It's often useful to know the LENGTH of a string, such as when checking the validity of a phone number string ("XXX-XXX-XXXX"), or checking for any unexpected empty strings (length of 0).

We can do this easily with the len() function.

In [12]:
empty_string = ""
short_string = "hey"
long_string = "hey there! how's it going?? what nice weather we're having :')"

In [13]:
print(len(empty_string))
print(len(short_string))
print(len(long_string))

0
3
62
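
The phone number check mentioned above could be sketched roughly like this. The phone numbers are made-up example values, and the `==` comparison is covered in the booleans section further down:

```python
# Made-up example values, for illustration only.
good_number = "434-555-0123"
bad_number = "434-5550123"   # missing a dash

# "XXX-XXX-XXXX" has 12 characters, counting the two dashes.
expected_length = len("XXX-XXX-XXXX")

print(len(good_number) == expected_length)  # True
print(len(bad_number) == expected_length)   # False
```

A length check alone doesn't prove a phone number is valid, but it is a cheap first filter for obviously malformed values.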


#### More string methods

There are a wide variety of methods we can apply to our strings to alter them or perform certain "checks" on them, such as determining if the string is all numeric, all alphabetical, etc.

The most important takeaway is that you should **NOT** memorize these.

We'll show some examples below, but finding the full list is as simple as googling "python string methods."
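
As a quick sketch of the "checks" mentioned above, the built-in methods isdigit(), isalpha(), and isalnum() each return True or False. The example values are illustrative:

```python
zip_code = "22903"
city = "Charlottesville"
mixed = "abc123"

print(zip_code.isdigit())  # True: every character is a digit
print(city.isalpha())      # True: every character is a letter
print(mixed.isdigit())     # False: contains letters
print(mixed.isalpha())     # False: contains digits
print(mixed.isalnum())     # True: only letters and digits
```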

In [14]:
#Converting case.
ds = "dS is dAta sCIENCe"

print(ds.upper())
print(ds.lower())
print(ds.title())
print(ds.capitalize())

DS IS DATA SCIENCE
ds is data science
Ds Is Data Science
Ds is data science


In [15]:
#Replacing certain parts of a string

ds2 = "Cool Kids Study Data Science"
print(ds2.replace("Science", "Analysis"))

Cool Kids Study Data Analysis


In [16]:
ds3 = "Cool Kids Study Data Science"
print(ds3.replace("Science", ""))

Cool Kids Study Data 


**Important note on string methods:**

These string methods and operations DO NOT change the original string! Strings in Python are immutable, so a method like upper() or replace() simply returns a COPY of that string, modified according to the method you specified.

You can then save that new copy to a new variable, or save it over the original variable, if you want!

In [17]:
initial_var = "data is cool"

#Converting to uppercase
initial_var.upper()

#Original variable isn't changed.
initial_var

'data is cool'

In [18]:
#But, if we wanted to save the changed variable to a new variable:

initial_var = "data is cool"

modified_var = initial_var.upper()

print("Initial: " + initial_var)
print("Modified: " + modified_var)

Initial: data is cool
Modified: DATA IS COOL


In [19]:
#If we want, we can even save over the original variable instead of creating a new one.
initial_var = "data is cool"

initial_var = initial_var.upper()

print("Initial, written over itself: " + initial_var)

Initial, written over itself: DATA IS COOL
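
Because each method returns a new string, method calls can also be chained one after another. Here is a small sketch, assuming we want to tidy up a messy label; the variable names are hypothetical:

```python
messy_label = "  Data Science  "

# strip() removes the surrounding spaces, lower() converts to lowercase,
# and replace() swaps the remaining space for an underscore.
clean_label = messy_label.strip().lower().replace(" ", "_")

print(clean_label)        # data_science
print(len(messy_label))   # 16: the original string is unchanged
```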


## B: Numeric types (integers and floats)

The "int" (integer) type is named pretty intuitively. These are numbers without any decimal values, and will be displayed and behave as such.


In [21]:
print(type(5))

<class 'int'>


For representing numbers with more specificity (i.e., decimal points are required), we use floating-point numbers, or the **float** data type.

In [22]:
print(type(5.0))

<class 'float'>


Notice that while the 'literal' syntax for representing a string involved placing it between quotes, int and float types cannot be between quotes. If they are, they'll be represented by Python as strings!

In [23]:
print(type("5"))
print(type("5.0"))

<class 'str'>
<class 'str'>


### Operations with numeric types

In python, we're able to utilize all the usual mathematical operators, plus a few you might not be as familiar with.

In [24]:
# Basic math with integer literals:
print("Addition:", 2 + 5)
print("Multiplication:", 2 * 5)
print("Subtraction:", 2 - 5)
print("Division:", 4 / 2)

Addition: 7
Multiplication: 10
Subtraction: -3
Division: 2.0


Notice that while adding, multiplying, and subtracting integers returns an integer back, DIVISION by integers always returns a float, even if that float could've been represented as an integer! (Like 2.0, in this case.)

We'll see why this is important in a minute.

In [25]:
# Basic math with float literals:
print("Addition:", 2.5 + 4.5)
print("Multiplication:", 2.0 * 5.0)
print("Subtraction:", 2.0 - 5.0)
print("Division:",4.5 / 1.5)

Addition: 7.0
Multiplication: 10.0
Subtraction: -3.0
Division: 3.0


All of these operations between two floats return floats.

Some additional (surprisingly useful) mathematical operators:

In [26]:
print("Exponentiation: 3 to the power of 4 is", 3 ** 4)
print("Modulo, or remainder: remainder of 4 / 3 is", 4 % 3)
print("Floor, or integer division: 7/2, rounded DOWN to the nearest integer, is", 7//2)

Exponentiation: 3 to the power of 4 is 81
Modulo, or remainder: remainder of 4 / 3 is 1
Floor, or integer division: 7/2, rounded DOWN to the nearest integer, is 3
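
To give a sense of why these operators are useful, here is a minimal sketch that splits a number of minutes into hours and leftover minutes, and checks whether a number is even. The values are illustrative:

```python
total_minutes = 135

hours = total_minutes // 60    # whole hours: 2
minutes = total_minutes % 60   # minutes left over: 15

print(hours, "hours and", minutes, "minutes")

# A number is even when dividing it by 2 leaves no remainder.
print(10 % 2 == 0)  # True
print(7 % 2 == 0)   # False
```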


Python is smart enough to apply PEMDAS (the standard order of operations) within individual lines, including parentheses, when they're used!

In [27]:
print(2+3*5)
print((2+3)*5)

17
25


Additionally, using int() and float() we can convert back and forth between these types.

Note that converting a float to an int does not round to the nearest whole number. It simply **drops the decimal part** (truncating toward zero), so positive numbers are rounded down. This is not the typical rounding behavior you'd expect!

In [28]:
print(float(2))

print(int(3.99999999999))

2.0
3
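
A short sketch comparing int() with some other ways of getting a whole number. Note that int() truncates toward zero, which only matches "rounding down" for positive numbers; math.floor() and round() are from Python's standard library:

```python
import math

print(int(3.7))          # 3: decimal part dropped
print(int(-3.7))         # -3: truncated toward zero, not rounded down
print(math.floor(-3.7))  # -4: floor always rounds down
print(round(3.7))        # 4: rounds to the nearest whole number
print(round(-3.7))       # -4
```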


## C: Booleans (type: bool)

The boolean datatype is interesting in that it only has two possible values, True and False.

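In Python, booleans are expressed with a capital first letter. Lowercase true and false are not booleans: Python treats them as ordinary variable names, and using one that hasn't been defined will cause an error (a NameError).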

In [32]:
our_first_boolean = True
our_second_boolean = False

In [33]:
print(type(our_first_boolean))

<class 'bool'>


In [34]:
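# Capitalization matters! Lowercase `true` is just an ordinary variable name.
# Without the assignment on the next line, `failed_boolean = true` would
# raise a NameError, because `true` would be undefined.
true = True
failed_boolean = true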

Booleans most commonly arise as a result of comparing two other variables or data points.

To do so, we can use the following operators:
- == (to determine equality — NOT =!)
- \> or >=
- < or <=
- != (not equals)

In [35]:
# These return booleans:
print("hello" == "Hello")
print(2 >= 1.98)
var1 = True
var2 = False
print(var1 == var2)

False
True
False
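
A comparison result can also be saved to a variable, just like any other value. The sketch below also shows how equality behaves across types; the variable name `is_adult` is illustrative:

```python
# An int and a float with the same numeric value are equal.
print(5 == 5.0)   # True

# A string is never equal to a number, even if it looks like one.
print("5" == 5)   # False

# The result of a comparison is a bool that we can store and reuse.
is_adult = 30 >= 18
print(is_adult)         # True
print(type(is_adult))   # <class 'bool'>
```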


# Type coercion

...and a bunch of common pitfalls in python coding and data manipulation!


Python is pretty good at converting data from one type to the appropriate value of another type. This is especially useful in cases where we have, say, a string representation where we really want a number.

In [38]:
# This won't give us what we want...
print('2' + '6')
print('25'* 4)

26
25252525


In [39]:
# ...so we can cast these strings to numeric types, like int!
print(int('2') + int('6'))

8


In [40]:
# Same with floats, where appropriate.

# Instead of this:
print('2.7' + '3.3')

# ...do this.
print(float("2.7") + float("3.3"))

2.73.3
6.0
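
One pitfall worth knowing: casting only works when the string actually looks like the target type. The sketch below uses try/except, which has not been introduced in this notebook, purely so that the failing conversion doesn't stop the cell:

```python
print(int("42"))      # 42
print(float("2.7"))   # 2.7

try:
    int("2.7")        # a decimal string cannot be cast directly to int
except ValueError as error:
    print("Could not convert:", error)

# Going through float first works; int() then drops the decimal part.
print(int(float("2.7")))  # 2
```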


This can also go back the other way, from an int or float to a string.

This is especially helpful when you need to concatenate numbers to strings, like below:

In [41]:
phil_age = 30
phil_street = "Maple Drive"
phil_house_number = 1234

In [42]:
# This will throw an error, since python doesn't automatically convert the numbers to strings.
print("Phil is " + phil_age + " years old and lives on " + phil_house_number + phil_street)

TypeError: can only concatenate str (not "int") to str

That error message was pretty informative! Let's fix it through casting the numbers to string with str(number):

In [43]:
# This works, since str() converts the numbers to strings before concatenating.
print("Phil is " + str(phil_age) + " years old and lives on " + str(phil_house_number) +" "+ phil_street)

Phil is 30 years old and lives on 1234 Maple Drive
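
As an alternative to calling str() on every number, Python 3.6 and later also support f-strings, which convert the values inside the curly braces for us. This is a sketch of the same sentence, not a replacement for the approach above; `phil_sentence` is a new, illustrative variable name:

```python
phil_age = 30
phil_street = "Maple Drive"
phil_house_number = 1234

# The f prefix lets us place variables directly inside {} in the string.
phil_sentence = f"Phil is {phil_age} years old and lives on {phil_house_number} {phil_street}"
print(phil_sentence)
```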


Booleans also have the numerical and string equivalents that you might expect.

In [44]:
print(int(True)) #1 for true
print(int(False)) #0 for false

print(str(True))
print(str(False))

1
0
True
False


In [45]:
print((True + True + True)*2 + False )

6
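
Since True counts as 1 and False as 0, adding booleans together is a simple way to count how many conditions are met. Here is a minimal sketch with made-up checks:

```python
# Three made-up checks, each of which evaluates to True or False.
passed_check_1 = 3 > 1    # True
passed_check_2 = 2 > 5    # False
passed_check_3 = 4 == 4   # True

checks_passed = passed_check_1 + passed_check_2 + passed_check_3
print(checks_passed)  # 2
```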
