
# **Data Types Revisited** 

We'll now take a closer look at all the possible data types in Python so that we can stop worrying about them once and for all.

## **Identifiers**

An identifier is simply the name we give to a variable. For example, if we type ``a = 1`` into Python, we define a variable of type ``int`` and give it the identifier ``a``.
There are rules that determine what is a valid identifier in Python. Essentially we want a sequence of letters, digits and underscores (we can actually use most Unicode letters, but please don't) where the first character must be a letter or an underscore. It is generally a bad idea to start a variable name with an underscore: by convention a leading underscore marks a name as internal, and names wrapped in double underscores, such as ``__init__``, are reserved by Python for special methods of classes (we'll cover them later on). Another rule is that identifiers cannot match any of Python's reserved keywords, such as ``if``, ``for``, ``break``, ``return``, and so on.
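
As a quick way to check these rules, the sketch below uses the built-in ``str.isidentifier()`` method together with the standard ``keyword`` module. The helper ``is_valid_name`` and the list of candidate names are made up here purely for illustration.

```python
import keyword

# Candidate names, chosen only for illustration.
candidates = ["a", "my_var", "_hidden", "2fast", "my-var", "for", "return"]

def is_valid_name(name):
    """Return True if `name` can be used as a variable name."""
    # str.isidentifier() checks the character rules (letters, digits,
    # underscores, no leading digit); keyword.iskeyword() rules out
    # reserved words such as `for` and `return`.
    return name.isidentifier() and not keyword.iskeyword(name)

for name in candidates:
    print(name, is_valid_name(name))
```

Names such as ``2fast`` (leading digit), ``my-var`` (contains a minus sign) and ``for`` (a keyword) are rejected, while ``_hidden`` is technically valid even though a leading underscore is best avoided for ordinary variables.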

## **Numerical data**

We'll now give a list of the basic built-in mathematical operators that Python offers. All of them work on both ``int`` and ``float`` values:

*   ``x+y`` addition of x and y
*   ``x-y`` subtraction of y from x (``-x`` on its own changes the sign of x)
*   ``x*y`` multiplication of x and y
*   ``x**y`` x to the power of y, same as ``pow(x,y)``
*   ``x/y`` x divided by y (always returns a ``float``)
*   ``x//y`` floor division: the quotient rounded down (an ``int`` if both operands are ``int``, otherwise a ``float``)
*   ``x % y`` x modulo y (the remainder of ``x//y``)
*   ``abs(x)`` absolute value of x
*   ``x+=y`` is shorthand for ``x = x+y``
*   ``x-=y`` is shorthand for ``x = x-y``
*   ``x*=y`` is shorthand for ``x = x*y``
*   ``x/=y`` is shorthand for ``x = x/y``
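
The division operators are the ones that most often cause surprises, so here is a small sketch of how ``/``, ``//`` and ``%`` behave on integers, floats and negative numbers. The variable names are arbitrary.

```python
# True division always returns a float, even for integer operands.
d = 6 / 3        # 2.0

# Floor division and modulo on integers return integers.
q = 7 // 2       # 3
r = 7 % 2        # 1

# The same operators also accept floats, and then return floats.
qf = 7.5 // 2    # 3.0
rf = 7.5 % 2     # 1.5

# Floor division rounds towards minus infinity, not towards zero.
qn = -7 // 2     # -4
rn = -7 % 2      # 1

# Quotient and remainder always satisfy x == (x // y) * y + (x % y).
print(q * 2 + r == 7, qn * 2 + rn == -7)

# Augmented assignment updates a variable in place.
x = 10
x += 5           # x is now 15
x /= 3           # x is now 5.0 (a float, because of true division)
print(d, q, r, qf, rf, qn, rn, x)
```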


## **Strings**
Strings offer us a practical way of transitioning from basic data types to collections of data. In fact, in the eyes of Python a string is nothing but a sequence of characters! Let's start with the basics. A string in Python is any sequence of Unicode characters delimited by single or double quotation marks:



In [None]:
a  = "This is a string!"
b  = 'This is also a string!' 
c  = " even 'this' is a string!"

The only important thing is that the two delimiters must be the same. We can access single characters within a string as we would elements in a list. For example:

In [None]:
len(a)

17

In [None]:
a[0], a[2], a[5]

('T', 'i', 'i')

Now we introduce some indexing techniques that are often very useful when dealing with aggregate data. All sequences in Python (strings and lists, as well as arrays, which we will encounter later on) can be accessed from the start, using increasing indices starting from 0, or from the end, using decreasing indices starting from -1. So:

In [None]:
a[16],a[-1]

('!', '!')

We can also access sections of a list via index **slicing**, the general syntax going as follows:
```
list[start:end:step]
```
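We can omit any of those, and they default to the obvious values: ``0`` for start, the length of the sequence for end (so the slice runs through the last element), and ``1`` for step. Note that the element at index ``end`` itself is never included in the slice.
So for example: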

In [None]:
a[0:4:1]

'This'

In [None]:
a[0:4]

'This'

In [None]:
a[:4]

'This'

In [None]:
a[0::2]

'Ti sasrn!'

In [None]:
a[::2]

'Ti sasrn!'

Experiment with these; slicing takes some getting used to.
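
A few more slices worth trying are sketched below, combining negative indices and a negative step. The string is the same ``a`` used above; the other variable names are only illustrative.

```python
a = "This is a string!"

# Negative indices count from the end, also inside a slice.
last_word = a[-7:]       # 'string!'
without_last = a[:-1]    # 'This is a string' (index -1 itself is excluded)

# A negative step walks through the sequence backwards.
reversed_a = a[::-1]     # '!gnirts a si sihT'

# Slices never raise an error when the end runs past the sequence.
tail = a[10:100]         # 'string!'

print(last_word, without_last, reversed_a, tail, sep="\n")
```

The second line is the reason why the default for ``end`` cannot be ``-1``: ``a[:-1]`` drops the last character, whereas ``a[:]`` keeps the whole string.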

# **Collection Data Types**

We'll now discuss data types that hold a collection of elements, which may themselves be of different types. We'll start by briefly going back to lists and tuples, and then move on to a new, very important Python data type: the dictionary.

## **Sequences**

Sequence types are all those data types that support the ``len()`` function, are *iterable* (we'll see shortly what exactly this means) and can be indexed and sliced using the ``[]`` operator. These are essentially ``str``, ``tuple`` and ``list`` (plus a couple of less commonly used variants, ``bytearray`` and ``bytes``). As we've already discussed, a ``tuple`` is a sequence of data that cannot be modified once it is defined. The syntax for defining a ``tuple`` is as follows:

In [None]:
a = 12
mytuple = (1,a,"hello!!")
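
To see what "cannot be modified" means in practice, the short sketch below reads from the tuple defined above and then tries to assign to one of its elements. The names ``failed`` and ``single`` are introduced here only for illustration.

```python
a = 12
mytuple = (1, a, "hello!!")

# Reading works exactly as for strings: len(), indexing and slicing.
print(len(mytuple))    # 3
print(mytuple[-1])     # hello!!
print(mytuple[:2])     # (1, 12)

# Writing does not: item assignment raises a TypeError.
failed = False
try:
    mytuple[0] = 99
except TypeError as err:
    failed = True
    print("Cannot modify a tuple:", err)

# A one-element tuple needs a trailing comma, otherwise the
# parentheses are treated as ordinary grouping.
single = (42,)
print(type(single), type((42)))
```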


Lists are very similar to tuples but, unlike tuples, a list can be modified once it has been created, meaning we can replace, add, or remove items. The elements of lists and tuples can be accessed via the ``[]`` operator in exactly the same way as the characters of a string. Here's a list of useful methods that apply to lists (in all that follows, ``myList`` is a ``list`` object):


*   ``myList.append(x)`` appends ``x`` at the end of the list
*   ``myList.count(x)`` counts the number of occurrences of the element ``x`` in the list
*   ``myList.remove(x)`` removes the first occurrence of item ``x`` from the list
*   ``myList.pop(i)`` removes and returns the element at index ``i`` (the last element if ``i`` is omitted)
*   ``myList.insert(i,x)`` inserts item ``x`` at index ``i`` in the list
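
The methods above can be tried out one after the other on a small list. The values are arbitrary, and the comments show the state of the list after each call.

```python
myList = [3, 1, 4, 1, 5]

myList.append(9)           # [3, 1, 4, 1, 5, 9]
n_ones = myList.count(1)   # 2, the list itself is unchanged
myList.remove(1)           # [3, 4, 1, 5, 9], only the first 1 is removed
popped = myList.pop(0)     # popped is 3, list is now [4, 1, 5, 9]
myList.insert(1, 2)        # [4, 2, 1, 5, 9]

print(myList, n_ones, popped)
```

Note that ``append``, ``remove`` and ``insert`` modify the list in place and return ``None``, so writing ``myList = myList.append(9)`` would throw the list away.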





To refresh what was covered last time, let us see how we can access the items of a list using a ``for`` loop. If we just need to use the items one after the other inside the loop, the easiest way is:

In [1]:
mylist = ["foo", "bar", 128]
for item in mylist:
  print(item, "is part of my list!")

foo is part of my list!
bar is part of my list!
128 is part of my list!


But oftentimes it may be more convenient to access items by index, especially when dealing with large amounts of numerical data. Let's say we want to take a list of integers and increase each of them by one. We could do this via:

In [4]:
mylist = [1,2,32,42]
for i in range(len(mylist)):
  mylist[i] += 1 
print(mylist)

[2, 3, 33, 43]


Now we come to a somewhat more advanced way of dealing with lists in Python. We said at the beginning of the section that lists are **iterable**, but we didn't really explain what this implies. We'll come across iterables a few times throughout the remainder of this chapter, but for now suffice it to say that an iterable is any object that can hand out its items one at a time, which is exactly what a ``for`` loop relies on. Now, say we wish to create a list of all leap years between $1994$ and $2021$. The traditional way of doing this would go as follows:

In [6]:
leap_years = []
for year in range(1994,2021):
  if ( year % 4 == 0 and year % 100 !=0) or (year % 400 == 0):
    leap_years.append(year)

print(leap_years)

[1996, 2000, 2004, 2008, 2012, 2016, 2020]


Now we'll do this again, but smarter. To make our code more readable and faster to write, we introduce the syntax of **list comprehensions**. A list comprehension is just a compact way of writing a ``for`` loop (with an optional extra condition) that builds a new list directly. The simplest form of a list comprehension looks like this:
```
[ item for item in iterable]
```
Let's see this in an example:

In [7]:
mylist = [ n for n in range(10)]
print(mylist)

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]


Now this looks fancy, but its usefulness is somewhat questionable. What makes list comprehensions really powerful is that we can add a condition at the end:
```
[expression(item) for item in iterable if condition]
```
Again, let's see this in action by rewriting the leap year script:

In [8]:
mylist = [year for year in range(1994,2021) if (year%4 == 0 and year%100!=0) or (year%400==0)]
print(mylist)

[1996, 2000, 2004, 2008, 2012, 2016, 2020]


We can even use list comprehensions to replace nested ``for`` loops.
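
As a minimal sketch of the nested case, the example below builds all ``(i, j)`` index pairs of a small grid, first with two nested loops and then with a single comprehension. The ranges and variable names are arbitrary choices for illustration.

```python
# The traditional way: two nested loops filling an empty list.
pairs_loop = []
for i in range(3):
    for j in range(2):
        pairs_loop.append((i, j))

# The same thing as a list comprehension: the `for` clauses are
# written in the same order as the nested loops, outermost first.
pairs = [(i, j) for i in range(3) for j in range(2)]

print(pairs)
print(pairs == pairs_loop)   # True
```

This stays readable for two loops, but with three or more levels an explicit loop is usually easier to follow.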

## **Dictionaries**

The last class of data we'll be looking at is Python dictionaries. These are extremely useful in a variety of situations once one gets the hang of them, and also serve as an excellent way to begin understanding the object-oriented approach that Python takes to programming. A ``dict`` is a collection of **key-value** pairs (since Python 3.7 it remembers the order in which the pairs were inserted, but items are looked up by key, not by position). You can think of it as a list where the indices aren't just integers but **keys**. So what is a key? It can be any immutable (more precisely, hashable) object, such as a number or a tuple, but most commonly it's just a string. Say, for example, you're a teacher and would like to keep a record of your students' marks in their home assignments. You could use a list:

In [9]:
grades = [["A","B","B+"],["F","C","B-"],["A+","A+","A+"]]

But then you'd need to remember exactly in what order you entered the students' grades in the list! Not very practical. We then use a dictionary to do the job:

In [11]:
grades_dict = {"Rob": ["A","B","B+"],
               "Bob": ["F","C","B-"],
               "Julia": ["A+","A+","A+"]}

Now we could access each student's grades simply via:

In [12]:
print(grades_dict["Bob"])

['F', 'C', 'B-']
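
Like lists, dictionaries can be modified after they are created. The sketch below adds a new entry, updates an existing one and checks whether a key is present. The student ``"Alice"`` and the missing name ``"Zed"`` are hypothetical and not part of the example above.

```python
grades_dict = {"Rob": ["A", "B", "B+"],
               "Bob": ["F", "C", "B-"],
               "Julia": ["A+", "A+", "A+"]}

# Assigning to a new key adds a key-value pair ("Alice" is hypothetical).
grades_dict["Alice"] = ["B", "A-"]

# The value is an ordinary list, so it can be modified as usual.
grades_dict["Bob"].append("C+")

# `in` tests whether a key is present.
print("Julia" in grades_dict)        # True
print("Zed" in grades_dict)          # False

# grades_dict["Zed"] would raise a KeyError; .get() returns a
# default value instead when the key is missing.
print(grades_dict.get("Zed", []))    # []
print(grades_dict["Bob"])            # ['F', 'C', 'B-', 'C+']
```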


We can also access the keys and the values separately as iterable objects:

In [20]:
mydict = dict()
mydict[1] = "foo"
mydict["bar"] = "F"
mydict["foobar"] = "foobar"

print(mydict.keys(),mydict.values())

dict_keys([1, 'bar', 'foobar']) dict_values(['foo', 'F', 'foobar'])


In [21]:
for key in mydict.keys():
  print(key)

1
bar
foobar


In [23]:
for value in mydict.values():
  print(value)

foo
F
foobar


Actually, the dictionary itself can be iterated over, which produces the same result as iterating over its keys:

In [24]:
for key in mydict:
  print(key)

1
bar
foobar


Another useful method is ``dict_name.items()``, which returns an iterable object running over all ``(key, value)`` pairs:

In [25]:
for key, value in mydict.items():
  print(key, value)

1 foo
bar F
foobar foobar
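
Going back to the grades example, ``items()`` makes it easy to process every student in one loop. The sketch below unpacks each pair into two names and then counts the ``"A+"`` grades per student; ``a_plus_counts`` is a name made up for this illustration, and the last statement uses a dictionary comprehension, which follows the same pattern as the list comprehensions seen earlier.

```python
grades_dict = {"Rob": ["A", "B", "B+"],
               "Bob": ["F", "C", "B-"],
               "Julia": ["A+", "A+", "A+"]}

# Unpack each (key, value) pair directly in the loop header.
for name, grades in grades_dict.items():
    print(name, "has", len(grades), "grades; the latest is", grades[-1])

# Build a new dictionary mapping each student to their number of "A+" grades.
a_plus_counts = {name: grades.count("A+") for name, grades in grades_dict.items()}
print(a_plus_counts)   # {'Rob': 0, 'Bob': 0, 'Julia': 3}
```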
