### Python Data Types

This tutorial covers the following topics:

* Storing information using variables
* Primitive data types in Python: Integer, Float, Boolean, None and String
* Built-in data structures in Python: List, Tuple and Dictionary
* Methods and operators supported by built-in data types

### Storing information using variables

In [1]:
my_favorite_color = "raspberry"

In [2]:
my_favorite_color

'raspberry'

In [3]:
color_code = '#E30B5C'

In [4]:
color_code

'#E30B5C'

Variable names can be short (a, x, y, etc.) or descriptive (my_favorite_color, profit_margin, the_3_musketeers, etc.). However, you must follow these rules when naming Python variables:

* A variable's name must start with a letter or the underscore character _. It cannot begin with a number.
* A variable name can only contain lowercase (small) or uppercase (capital) letters, digits, or underscores (a-z, A-Z, 0-9, and _).
* Variable names are case-sensitive, i.e., a_variable, A_Variable, and A_VARIABLE are all different variables.
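One way to check these rules without triggering a `SyntaxError` is the built-in `str.isidentifier` method, combined with `keyword.iskeyword` to rule out reserved words such as `for`. The sketch below is only illustrative: `candidate_names` and `validity` are names made up for this example. Note that `isidentifier` also accepts non-ASCII letters, which Python permits in names.

```python
import keyword

# Hypothetical names to test, including the invalid ones tried in this section
candidate_names = [
    "a_variable",
    "a variable",
    "is_today_$aturday",
    "my-favorite-car",
    "3_musketeers",
    "the_3_musketeers",
    "for",
]

validity = {}
for name in candidate_names:
    # A usable name must follow the identifier rules and must not be a reserved word
    validity[name] = name.isidentifier() and not keyword.iskeyword(name)
    print(f"{name!r}: {validity[name]}")
```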

In [5]:
a variable = 23

SyntaxError: invalid syntax (605469086.py, line 1)

In [6]:
a_variable = 23

In [7]:
is_today_$aturday = False

SyntaxError: invalid syntax (3433388187.py, line 1)

In [8]:
my-favorite-car = "Volvo"

SyntaxError: cannot assign to expression here. Maybe you meant '==' instead of '='? (1615860382.py, line 1)

In [9]:
3_musketeers = ['Athos', 'Porthos', 'Aramis']

SyntaxError: invalid decimal literal (1028151984.py, line 1)

### Built-in data types in Python

Any data or information stored within a Python variable has a type. You can view the type of data stored within a variable using the type function.

In [10]:
type(my_favorite_color)

str

In [11]:
type(a_variable)

int

Python has several built-in data types for storing different kinds of information in variables. The following are some commonly used data types:

* Integer
* Float
* Boolean
* None
* String
* List
* Tuple
* Dictionary

Integer, float, boolean, None, and string are **primitive data types** because they represent a single value. Other data types like list, tuple, and dictionary are often called **data structures or containers** because they hold multiple pieces of data together.

In [12]:
current_year = 2022

In [13]:
type(current_year)

int

In [14]:
float(current_year)

2022.0

In [15]:
int(current_year)

2022

You can convert floats into integers and vice versa using the float and int functions. The operation of converting one type of value into another is called casting.
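As a small sketch of what casting does to the value itself: `int` discards the fractional part rather than rounding, while the built-in `round` rounds to the nearest integer. The variable names below are made up for illustration.

```python
price = 3.99

truncated = int(price)            # fractional part is discarded, not rounded
rounded = round(price)            # rounds to the nearest integer
negative_truncated = int(-3.99)   # truncation moves toward zero
whole_as_float = float(7)         # an integer becomes a float with .0
from_text = int("2022")           # a string of digits can be converted too

print(truncated, rounded, negative_truncated, whole_as_float, from_text)
```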

Booleans represent one of two values: True and False. Booleans have the type bool. Any value in Python can be converted to a Boolean using the bool function.

The following values evaluate to False (they are often called *falsy values*):

* The value False itself
* The integer 0
* The float 0.0
* The empty value None
* The empty text ""
* The empty list []
* The empty tuple ()
* The empty dictionary {}
* The empty set set()
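* The empty range range(0)

Nearly everything else evaluates to True (a value that evaluates to True is often called a *truthy value*). The few exceptions are other zero or empty values, such as the complex number `0j` and the empty bytes object `b""`.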
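Truthiness is what `if` statements test, so a few values that look empty can be surprising. The following sketch uses a hypothetical helper `describe` and made-up sample values to show that non-empty strings and containers are truthy even when their contents are zero or None.

```python
def describe(value):
    """Return 'truthy' or 'falsy' depending on how value behaves in a condition."""
    return "truthy" if value else "falsy"

# These look "empty" or "zero-like" but are non-empty strings or containers
surprising = ["0", " ", [0], (None,), {"key": None}]
for value in surprising:
    print(repr(value), "is", describe(value))

# A common idiom: test a list for emptiness without calling len()
shopping_list = []
if not shopping_list:
    status = "nothing to buy"
else:
    status = "time to shop"
print(status)
```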

In [16]:
bool(False)

False

In [17]:
bool(0)

False

In [18]:
type(0)

int

In [19]:
bool(True), bool(1), bool(2.0), bool("hello"), bool([1,2]), bool((2,3)), bool(range(10))

(True, True, True, True, True, True, True)

**None**

The None type includes a single value None, used to indicate the absence of a value. None has the type NoneType. It is often used to declare a variable whose value may be assigned later.

In [20]:
nothing = None

In [21]:
type(nothing)

NoneType

A string is used to represent text (a string of characters) in Python. Strings must be enclosed in quotation marks (either the single quote ' or the double quote "). Strings have the type str. To use a double quote within a string written with double quotes, *escape* the inner quotes by prefixing them with the `\` character.

In [22]:
text = "this is \"text\" and this is a new line \n and text on the new line, then I add a tab \t followed by more text \n a new line, and on it I print \\n"

In [23]:
# Use triple quotes (""") for multiline strings
text = """this is \"text\" and this is a new line \n and text on the new line, then I add a tab \t followed by more 
text \n a new line, and on it I print \\n"""

In [24]:
print(text)

this is "text" and this is a new line 
 and text on the new line, then I add a tab 	 followed by more 
text 
 a new line, and on it I print \n


Note that special characters like \n and escaped characters like \" count as a single character, even though they are written and sometimes printed as two characters.

In [25]:
multiline_string = """a
b"""
multiline_string

'a\nb'

In [26]:
len(multiline_string)

3

In [27]:
list(multiline_string)

['a', '\n', 'b']

In [28]:
today = "Saturday"

In [29]:
today[7]

'y'

In [30]:
today[5:8]

'day'

In [31]:
'day' in today

True

Strings in Python have many built-in *methods* that are used to manipulate them. Let's try out some common string methods.

> **Methods**: Methods are functions associated with data types and are accessed using the `.` notation e.g. `variable_name.method()` or `"a string".method()`. Methods are a powerful technique for associating common operations with values of specific data types.

The `.lower()`, `.upper()` and `.capitalize()` methods are used to change the case of the characters.

In [32]:
today.lower()

'saturday'

In [33]:
"saturday".upper()

'SATURDAY'

In [34]:
"monday".capitalize() # changes first character to uppercase

'Monday'

The `.replace` method replaces a part of the string with another string. It takes the portion to be replaced and the replacement text as inputs or arguments.

In [35]:
another_day = today.replace("Satur", "Wednes")

In [36]:
another_day

'Wednesday'

Note that `replace` returns a new string, and the original string is not modified.
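A short sketch of what this means in practice: the original variable keeps its value unless the result is assigned back to it. The variable names mirror the cells above.

```python
today = "Saturday"
another_day = today.replace("Satur", "Wednes")

print(today)        # the original string is unchanged
print(another_day)  # the new string holds the result

# To "update" the variable, assign the result back to the same name
today = today.replace("Satur", "Sun")
print(today)
```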

The `.split` method splits a string into a list of strings at every occurrence of provided character(s).

In [37]:
"Sun,Mon,Tue,Wed,Thu,Fri,Sat".split(",")

['Sun', 'Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat']
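As a complement to `.split`, the `.join` method does the reverse and combines a list of strings using a separator. The sketch below also shows two variations of `.split`: calling it with no argument splits on runs of whitespace, and an optional second argument limits the number of splits. The variable names are illustrative.

```python
days_text = "Sun,Mon,Tue,Wed,Thu,Fri,Sat"

days = days_text.split(",")           # split at every comma
rejoined = " | ".join(days)           # join the pieces with a different separator
words = "  several   spaced   words ".split()  # no argument: split on whitespace runs
first_two = days_text.split(",", 2)   # at most 2 splits; the rest stays together

print(rejoined)
print(words)
print(first_two)
```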

The `.strip` method removes whitespace characters from the beginning and end of a string.

In [38]:
a_long_line = "       This is a long line with some space before, after,     and some space in the middle..    "

In [39]:
a_long_line_stripped = a_long_line.strip()

In [40]:
a_long_line_stripped

'This is a long line with some space before, after,     and some space in the middle..'
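Two related methods, `.lstrip` and `.rstrip`, strip only one side, and `.strip` accepts an optional argument listing the characters to remove instead of whitespace. A minimal sketch with made-up sample strings:

```python
padded = "   hello world   "

left_only = padded.lstrip()    # removes leading whitespace only
right_only = padded.rstrip()   # removes trailing whitespace only
both_sides = padded.strip()    # removes both; inner spaces are kept

# Passing characters to strip removes those characters instead of whitespace
trimmed_dashes = "--heading--".strip("-")

print(repr(left_only), repr(right_only), repr(both_sides), repr(trimmed_dashes))
```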

The `.format` method combines values of other data types, e.g., integers, floats, booleans, lists, etc. with strings. You can use `.format` to construct output messages for display.

In [41]:
# Input variables
cost_of_ice_bag = 1.25
profit_margin = .2
number_of_bags = 500

# Template for output message
output_template = """If a grocery store sells ice bags at $ {} per bag, with a profit margin of {} %, 
then the total profit it makes by selling {} ice bags is $ {}."""

print(output_template)

If a grocery store sells ice bags at $ {} per bag, with a profit margin of {} %, 
then the total profit it makes by selling {} ice bags is $ {}.


In [42]:
# Inserting values into the string
total_profit = cost_of_ice_bag * profit_margin * number_of_bags
output_message = output_template.format(cost_of_ice_bag, profit_margin*100, number_of_bags, total_profit)

print(output_message)

If a grocery store sells ice bags at $ 1.25 per bag, with a profit margin of 20.0 %, 
then the total profit it makes by selling 500 ice bags is $ 125.0.

Notice how the placeholders `{}` in the `output_template` string are replaced with the arguments provided to the `.format` method.

It is also possible to use the string concatenation operator `+` to combine strings with other values. However, those values must first be converted to strings using the `str` function.
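The `+` concatenation approach described in this section can be sketched as follows. The example reuses the values from the ice bag cells, shows the `TypeError` raised when `str` is left out, and ends with a `.format` placeholder that fixes the number of decimal places. The names `message`, `concat_error`, and `formatted` are made up for illustration.

```python
number_of_bags = 500
total_profit = 125.0

# Non-string values must be converted with str() before concatenation
message = "Selling " + str(number_of_bags) + " bags earns $ " + str(total_profit)
print(message)

# Without str(), Python refuses to combine a string and an integer
try:
    broken = "Selling " + number_of_bags
except TypeError as error:
    concat_error = error
    print("TypeError:", error)

# .format needs no conversion, and {:.2f} shows two decimal places
formatted = "Total profit: $ {:.2f}".format(total_profit)
print(formatted)
```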


**List**

A list in Python is an ordered collection of values. Lists can hold values of different data types and support operations to add, remove, and change values. Lists have the type list.

To create a list, enclose a sequence of values within square brackets [ and ], separated by commas.

In [43]:
fruits = ['apple', 'banana', 'cherry']

In [44]:
a_list = [23, 'hello', None, 3.14, fruits, 3 <= 5]

In [45]:
type(fruits)

list

In [46]:
empty_list = []

In [47]:
empty_list

[]

In [48]:
type(empty_list)

list

In [49]:
print("Number of fruits:", len(fruits))

Number of fruits: 3


In [50]:
fruits[2]

'cherry'

In [51]:
fruits[-3]

'apple'

In [52]:
a_list = [23, 'hello', None, 3.14, fruits, 3 <= 5]

In [53]:
a_list[2:5]

[None, 3.14, ['apple', 'banana', 'cherry']]

In [54]:
a_list[0:10]

[23, 'hello', None, 3.14, ['apple', 'banana', 'cherry'], True]

In [55]:
a_list[12:10]

[]

In [56]:
a_list[3:]

[3.14, ['apple', 'banana', 'cherry'], True]

In [57]:
a_list[-2:-5]

[]

In [58]:
a_list[-5:-2]

['hello', None, 3.14]

In [59]:
fruits.append('dates')

In [60]:
fruits

['apple', 'banana', 'cherry', 'dates']

In [61]:
fruits.insert(2, 'blueberry')

In [62]:
fruits

['apple', 'banana', 'blueberry', 'cherry', 'dates']

In [63]:
fruits.remove('blueberry')

In [64]:
fruits

['apple', 'banana', 'cherry', 'dates']

In [65]:
fruits.remove('banana', 'dates')

TypeError: list.remove() takes exactly one argument (2 given)

In [66]:
fruits.pop(1) # If no index is provided, the pop method removes the last element of the list.

'banana'

In [67]:
fruits

['apple', 'cherry', 'dates']

In [68]:
'pineapple' in fruits

False

In [69]:
more_fruits = fruits + ['pineapple', 'tomato', 'guava'] + ['avocado', 'banana']

In [70]:
more_fruits_copy = more_fruits.copy()

In [71]:
more_fruits_copy

['apple',
 'cherry',
 'dates',
 'pineapple',
 'tomato',
 'guava',
 'avocado',
 'banana']

In [72]:
# Modify the copy
more_fruits_copy.remove('pineapple')
more_fruits_copy.pop()
more_fruits_copy

['apple', 'cherry', 'dates', 'tomato', 'guava', 'avocado']

Note that you cannot create a copy of a list by simply creating a new variable using the assignment operator `=`. The new variable will point to the same list, and any modifications performed using either variable will affect the other.
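The difference between assignment and `.copy()` can be seen in a small sketch. The list contents and variable names are illustrative.

```python
original = ['apple', 'cherry']

alias = original               # a second name for the same list
independent = original.copy()  # a separate list with the same elements

alias.append('dates')          # modifies the one list shared by both names

print(original)      # the change made through alias is visible here
print(independent)   # the copy is unaffected
print(alias is original, independent is original)
```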

In [73]:
# Reverse the order of elements in a list
more_fruits_copy.reverse()
more_fruits_copy

['avocado', 'guava', 'tomato', 'dates', 'cherry', 'apple']

In [74]:
# Append a single element (the first fruit) to the end of the list
more_fruits_copy.append(fruits[0])
more_fruits_copy

['avocado', 'guava', 'tomato', 'dates', 'cherry', 'apple', 'apple']
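To add all the elements of one list to the end of another, one option is the `.extend` method. The sketch below contrasts it with `.append`, which adds its argument as a single element even when that argument is a list. The lists are made up for this example.

```python
basket = ['avocado', 'guava']
extra = ['apple', 'cherry']

appended = basket.copy()
appended.append(extra)    # the whole list becomes one nested element

extended = basket.copy()
extended.extend(extra)    # each element of extra is added individually

print(appended)
print(extended)
```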

In [75]:
# Sort a list of strings in alphabetical order
more_fruits_copy.sort()

In [76]:
more_fruits_copy

['apple', 'apple', 'avocado', 'cherry', 'dates', 'guava', 'tomato']

In [78]:
# Sort a list of numbers in decreasing order
numbers = [1, 2, 5, 62, 72, 43, 25, 34, 0]
numbers.sort(reverse=True)

In [79]:
numbers

[72, 62, 43, 34, 25, 5, 2, 1, 0]

**Tuple**

A tuple is an ordered collection of values, similar to a list. However, it is not possible to add, remove, or modify values in a tuple. A tuple is created by enclosing values within parentheses ( and ), separated by commas.

Any data structure that cannot be modified after creation is called immutable. You can think of tuples as immutable lists.

Let's try some experiments with tuples.

In [80]:
fruits = ('apple', 'cherry', 'dates')

In [81]:
# try to change an element
fruits[0] = 'avocado'

TypeError: 'tuple' object does not support item assignment

In [82]:
fruits.append('blueberry')

AttributeError: 'tuple' object has no attribute 'append'
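Since a tuple cannot be changed in place, a common workaround is to build a new tuple from the old one using concatenation and slicing. A minimal sketch, with variable names chosen for illustration:

```python
fruits = ('apple', 'cherry', 'dates')

# "Appending": concatenate with a single-element tuple (note the trailing comma)
more_fruits = fruits + ('blueberry',)

# "Replacing" the first element: combine a new element with a slice of the rest
replaced = ('avocado',) + fruits[1:]

print(more_fruits)
print(replaced)
print(fruits)  # the original tuple is unchanged
```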

You can also skip the parentheses ( and ) while creating a tuple. Python automatically converts comma-separated values into a tuple.

In [83]:
the_3_musketeers = 'Athos', 'Porthos', 'Aramis'

In [84]:
the_3_musketeers

('Athos', 'Porthos', 'Aramis')

In [85]:
single_element_tuple = 4,

In [86]:
single_element_tuple

(4,)

In [87]:
another_single_element_tuple = (4,)
another_single_element_tuple

(4,)

In [88]:
not_a_tuple = (4)
not_a_tuple

4

You can convert a list into a tuple using the tuple function, and vice versa using the list function.

In [89]:
tuple(['one', 'two', 'three'])

('one', 'two', 'three')

In [90]:
list(('Athos', 'Porthos', 'Aramis'))

['Athos', 'Porthos', 'Aramis']

Tuples have just two built-in methods: count and index.

In [98]:
a_tuple = 23, "hello", False, None, 23, 37, "hello"

In [99]:
help(a_tuple.count)

Help on built-in function count:

count(value, /) method of builtins.tuple instance
    Return number of occurrences of value.



In [104]:
a_tuple.count(23)

2

In [105]:
a_tuple.index(37)

5
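Two details of `count` and `index` are worth a quick sketch: `index` returns the position of the first occurrence only, and it raises a `ValueError` when the value is absent, whereas `count` simply returns 0. The `lookup_failed` flag is a name made up for this example.

```python
a_tuple = 23, "hello", False, None, 23, 37, "hello"

first_hello = a_tuple.index("hello")      # position of the first "hello" only
missing_count = a_tuple.count("goodbye")  # absent values are counted as 0

# index raises ValueError for a value that is not in the tuple
try:
    a_tuple.index("goodbye")
except ValueError:
    lookup_failed = True
else:
    lookup_failed = False

print(first_hello, missing_count, lookup_failed)
```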

**Dictionary**

A dictionary is a collection of items. Each item stored in a dictionary has a key and value. You can use a key to retrieve the corresponding value from the dictionary. Since Python 3.7, dictionaries preserve the order in which items were inserted, but items are looked up by key rather than by position. Dictionaries have the type `dict`.

Dictionaries are often used to store many pieces of information, e.g., details about a person, in a single variable. Dictionaries are created by enclosing key-value pairs within braces or curly brackets `{` and `}`.

In [106]:
person1 = {
    'name': 'John Doe',
    'sex': 'Male',
    'age': 32,
    'married': True
}

In [107]:
person2 = dict(name='Jane Judy', sex='Female', age=28, married=False)

In [108]:
person1['name']

'John Doe'

In [110]:
person1['address']

KeyError: 'address'

In [111]:
person1.get("address", "Unknown")

'Unknown'

In [112]:
person1['address'] = 'Penny Lane'

In [113]:
person1.get("address", "Unknown")

'Penny Lane'

The `keys`, `values`, and `items` methods return a dictionary's keys, values, and key-value pairs, respectively. The results look like lists. However, they don't support the indexing operator `[]` for retrieving elements.

Can you figure out how to access an element at a specific index from these results? Try it below. *Hint: Use the `list` function.*

In [115]:
person1.keys()

dict_keys(['name', 'sex', 'age', 'married', 'address'])

In [116]:
person1.values()

dict_values(['John Doe', 'Male', 32, True, 'Penny Lane'])

In [117]:
person1.items()

dict_items([('name', 'John Doe'), ('sex', 'Male'), ('age', 32), ('married', True), ('address', 'Penny Lane')])

In [118]:
get_name = list(person1.values())
get_name[0]

'John Doe'
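Besides converting to a list, the results of `keys`, `values`, and `items` can be looped over directly, and the `in` operator checks whether a key is present. A small sketch using a made-up `person` dictionary:

```python
person = {'name': 'John Doe', 'sex': 'Male', 'age': 32}

# Convert to a list when a specific position is needed
first_key = list(person.keys())[0]
second_item = list(person.items())[1]

# The in operator tests for keys, not values
has_age = 'age' in person
has_address = 'address' in person

# Loop over key-value pairs without any conversion
for key, value in person.items():
    print(key, "->", value)
```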

In [119]:
person1.clear()

In [120]:
person1

{}

In [122]:
person = person2.copy()
person # this is a copy of dictionary person2

{'name': 'Jane Judy', 'sex': 'Female', 'age': 28, 'married': False}

In [123]:
# Create a dictionary with 3 keys, all with the value 0:

keys = ('key1', 'key2', 'key3') # iterable specifying the keys of the new dictionary
values = 0 # if not defined, default value is None

dictionary = dict.fromkeys(keys, values)

dictionary

{'key1': 0, 'key2': 0, 'key3': 0}

In [125]:
# Returns the value of the specified key:

car = {
  "brand": "Ford",
  "model": "Mustang",
  "year": 1964
}

x = car.get("model", 'no model')

print(x)

# Return a default value for a key that does not exist:

y = car.get("price", 15000)

print(y)

Mustang
15000


In [128]:
# Returns the value of the specified key. If the key does not exist, inserts the key with the specified value.

car = {
  "brand": "Ford",
  "model": "Mustang",
  "year": 1964
}

x = car.get("color", "Raspberry")

print(car)

y = car.setdefault("color", "Raspberry")

print(car)

{'brand': 'Ford', 'model': 'Mustang', 'year': 1964}
{'brand': 'Ford', 'model': 'Mustang', 'year': 1964, 'color': 'Raspberry'}


The only difference is that **`setdefault()` adds the key to the dictionary with the default value when the key is missing, while `get()` leaves the dictionary unchanged.**
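One more detail, sketched below with a made-up `car` dictionary: when the key already exists, `setdefault` returns the existing value and does not overwrite it, and `get` without a second argument returns None for a missing key.

```python
car = {"brand": "Ford", "model": "Mustang"}

# Key exists: the current value is returned and the dictionary is unchanged
existing = car.setdefault("brand", "Toyota")

# Key missing: the key is inserted with the default, which is also returned
added = car.setdefault("color", "Raspberry")

# get never inserts; with no default it returns None for a missing key
looked_up = car.get("price")

print(existing, added, looked_up)
print(car)
```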

In [129]:
# Updates the dictionary with the specified key-value pairs:

car = {
  "brand": "Ford",
  "model": "Mustang",
  "year": 1964
}

car.update({"color": "White"})

print(car)

{'brand': 'Ford', 'model': 'Mustang', 'year': 1964, 'color': 'White'}
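As a further sketch of `update`: it overwrites the values of keys that already exist, adds the keys that do not, and also accepts keyword arguments in place of a dictionary. The values below are made up for illustration.

```python
car = {"brand": "Ford", "model": "Mustang", "year": 1964}

# Existing key "year" is overwritten; new key "color" is added
car.update({"year": 2020, "color": "White"})

# Keyword arguments work too when the keys are valid names
car.update(color="Red")

print(car)
```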
