# Data Structures and Processing

## Week 1: Introduction to Python

## Name / Variable: First Encounter

In many programming languages, it is unnecessary to declare the type of an object before using it. In addition, if a name is already in use for an object of a certain type, it can be reused for an object of another type.  Such languages, Python among them, are called "dynamically typed".

Let us consider an example.  We reserve the name `x` for the variable and assign it first an integer and then a string.

We use the `print` function to print the result of an evaluation.  Without it, only the value of the last expression in a cell is displayed.

In [1]:
x = 2                        # x is of type int
tx = type(x)
print("The value of x: ", x)
print("The type of x: ", tx)

x = "Data Structures and Processing"
tx = type(x)
print("The value of x: ", x)
print("The type of x: ", tx)

The value of x:  2
The type of x:  <class 'int'>
The value of x:  Data Structures and Processing
The type of x:  <class 'str'>


### Task

Choose a variable name.  Assign it an object of your choice: `int`, `str`, `list`, and so on.  Print your chosen value and its type.  Now assign the same variable a different object and do the same (print the value and its type).

In [6]:
my_variable = 10
print(my_variable, type(my_variable))

# Reuse the same name for an object of a different type
my_variable = "Hello, world!"
print(my_variable, type(my_variable))

10 <class 'int'>
Hello, world! <class 'str'>


## Attributes

Recall that objects have attributes. Some of them are callable, and others directly return some information about the associated object.  A helpful function to list all the attributes of an object of a certain class is `dir`.  We use `dir(x)`, where `x` is the name of the variable whose attributes we want to list, or, if we know the class name, we simply type `dir(class_name)`.

Let us take an example.

We assign a float to variable `x` and list its attributes.

In [2]:
x = 3.1415
dir(x)

['__abs__',
 '__add__',
 '__bool__',
 '__ceil__',
 '__class__',
 '__delattr__',
 '__dir__',
 '__divmod__',
 '__doc__',
 '__eq__',
 '__float__',
 '__floor__',
 '__floordiv__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__getformat__',
 '__getnewargs__',
 '__getstate__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__int__',
 '__le__',
 '__lt__',
 '__mod__',
 '__mul__',
 '__ne__',
 '__neg__',
 '__new__',
 '__pos__',
 '__pow__',
 '__radd__',
 '__rdivmod__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__rfloordiv__',
 '__rmod__',
 '__rmul__',
 '__round__',
 '__rpow__',
 '__rsub__',
 '__rtruediv__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__sub__',
 '__subclasshook__',
 '__truediv__',
 '__trunc__',
 'as_integer_ratio',
 'conjugate',
 'fromhex',
 'hex',
 'imag',
 'is_integer',
 'real']

Let us say that at this point we have done some work and are not sure whether our variable holds an integer value or not.  We see in the above list that there is an attribute `is_integer`.  That sounds promising for our purpose.  So we use it as follows.

In [3]:
x.is_integer

<function float.is_integer()>

The output above is not what we wanted.  A Python user with some experience would know what to do at this point.  But let us assume that we do not know.  The next step is to get some help.  Help is called as follows.

In [4]:
help(x.is_integer)

Help on built-in function is_integer:

is_integer() method of builtins.float instance
    Return True if the float is an integer.



The help text suggests that we use it with parentheses at the end, because it is a callable method.  Now let us try one more time.

In [5]:
x.is_integer()  # It should return False.

False
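
The distinction between callable attributes and attributes that return information directly can also be checked programmatically. The following is a minimal sketch that uses only the built-ins `dir`, `getattr` and `callable`; the variable names `public_names`, `methods` and `data_attributes` are chosen here for illustration, and the exact lists may vary slightly between Python versions.

```python
x = 3.1415

# Keep only the names that do not start with an underscore.
public_names = [name for name in dir(x) if not name.startswith("_")]

# Split them into callable attributes (methods) and plain data attributes.
methods = [name for name in public_names if callable(getattr(x, name))]
data_attributes = [name for name in public_names if not callable(getattr(x, name))]

print("Methods:", methods)
print("Data attributes:", data_attributes)
```

Names listed under `methods` need parentheses to be evaluated, as in `x.is_integer()`, while names under `data_attributes` are used without them, as in `x.real`.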

### Task

Choose an object of a certain type, and explore the attributes associated with its class.  Get help on their use with the `help` function.

In [7]:
help(str)

Help on class str in module builtins:

class str(object)
 |  str(object='') -> str
 |  str(bytes_or_buffer[, encoding[, errors]]) -> str
 |  
 |  Create a new string object from the given object. If encoding or
 |  errors is specified, then the object must expose a data buffer
 |  that will be decoded using the given encoding and error handler.
 |  Otherwise, returns the result of object.__str__() (if defined)
 |  or repr(object).
 |  encoding defaults to sys.getdefaultencoding().
 |  errors defaults to 'strict'.
 |  
 |  Methods defined here:
 |  
 |  __add__(self, value, /)
 |      Return self+value.
 |  
 |  __contains__(self, key, /)
 |      Return key in self.
 |  
 |  __eq__(self, value, /)
 |      Return self==value.
 |  
 |  __format__(self, format_spec, /)
 |      Return a formatted version of the string as described by format_spec.
 |  
 |  __ge__(self, value, /)
 |      Return self>=value.
 |  
 |  __getattribute__(self, name, /)
 |      Return getattr(self, name).
 |  
 |  

## Boolean and Other Special Types

Recall that we have covered Booleans and the object `None`.  There are many other special types, and we shall explore them as we move forward. For now, let us recall what we know about Booleans and the ways to combine them.

Let us fix `x=True` and `y=False`, and determine the meaning of different combinations.

In [6]:
x = True
y = False

xandy = x and y
xory = x or y

print("Value of x and y is: ", xandy)
print("Value of x or y is: ", xory)

Value of x and y is:  False
Value of x or y is:  True


### Task

Recall De Morgan's law, $(A+B)^\prime=A^\prime \cdot B^\prime$.  Here $+$ stands for `or`, $\cdot$ stands for `and`, and $A^\prime$ is the mathematical notation for the negation (complement) of the statement $A$.

Assign values to $A$ and $B$ and print the result of the left hand side and the right hand side of De Morgan's law on separate lines.

It is advised that you first try to guess the outcomes before actually evaluating the block, to test your understanding of Booleans and the ways to combine them.

In [8]:
# Assign values to A and B
A = True
B = False

# Calculate the left hand side of De Morgan's law: (A + B)'
left_hand_side = not (A or B)

# Calculate the right hand side of De Morgan's law: A' * B'
right_hand_side = (not A) and (not B)

# Print the results
print("Left hand side:", left_hand_side)
print("Right hand side:", right_hand_side)

# Output
# Left hand side: False
# Right hand side: False

Left hand side: False
Right hand side: False
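
A single choice of `A` and `B` checks only one case. As a small sketch, the loop below runs the same comparison over all four combinations of truth values; the use of `itertools.product` and the list `results` are choices made here for illustration, not part of the original task.

```python
from itertools import product

# Check De Morgan's law for all four combinations of truth values.
results = []
for A, B in product([True, False], repeat=2):
    left_hand_side = not (A or B)
    right_hand_side = (not A) and (not B)
    results.append(left_hand_side == right_hand_side)
    print(A, B, left_hand_side, right_hand_side)

print("Law holds in every case:", all(results))
```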


## Numbers: Integers, Floats, Complex Numbers

Recall your knowledge about numbers in Python.

### Task

1. Choose `x` to be a float and make sure that it has a fractional part.  To check that a float is a "pure" float, that is, that it has a fractional part, `is_integer` is helpful.  Apply the function `int` to `x` and observe the result.  Explain to yourself the reason behind the result.

2. Now choose `x` to be the string "2.3".  Apply `int` again and observe the result.  Warning: An error will be raised this time.  Explain to yourself the reason for the error.
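
The two steps above might be tried along the following lines. This is only a sketch: the value `2.7` and the names `truncated`, `conversion_failed` and `two_step` are picked here for illustration.

```python
x = 2.7
print(x.is_integer())      # False: x has a fractional part
truncated = int(x)
print(truncated)           # 2: the fractional part is discarded

s = "2.3"
try:
    int(s)
    conversion_failed = False
except ValueError as error:
    conversion_failed = True
    print("ValueError:", error)

# Converting in two steps works: first to float, then to int.
two_step = int(float(s))
print(two_step)            # 2
```

Applied to a float, `int` truncates towards zero. Applied to a string, it accepts only text that reads as an integer literal, which is why `"2.3"` is rejected.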

### Task

Let us choose two floats `x=1.2` and `y=2.2`.  If we add the two numbers, we obtain a result different than 3.4.  See below.

In [7]:
x=1.2
y=2.2
z = x+y
z

3.4000000000000004

The type `float` has a method `hex` that might be helpful in explaining why we do not obtain 3.4. Let us use it as follows.

In [8]:
z.hex()

'0x1.b333333333334p+1'

The returned result is a hexadecimal (base 16) representation of the sum of `x` and `y`.  It might look cryptic at this point, but let us see how we can manually obtain the decimal number from it.  In `0x1.b333333333334p+1`, the prefix `0x` marks the number as hexadecimal.  To the left of the point we have the integer 1, and to the right we have `b` (which stands for 11), followed by many 3s and a final 4.  After the `p` comes the exponent `+1`, which is written in decimal.  The digits to the left of the point are multiplied by nonnegative powers of 16 (starting with power 0 and increasing towards the left), and those to the right of the point are multiplied by negative powers of 16 (starting with -1 and decreasing towards the right).  All these products are summed, and the resulting sum is multiplied by 2 raised to the power of the exponent (here $2^{1}$).  Note that the exponent is a power of 2, not of 16.

In the following cell, perform the above computation.
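
One possible way to carry out this computation is sketched below. It assumes a positive, finite float and relies on the format Python uses for `float.hex`, in which the exponent after `p` is a power of 2. The helper names (`mantissa_str`, `exponent_str`, `int_part`, `frac_part`, `value`) are chosen here for illustration.

```python
z = 1.2 + 2.2
hex_repr = z.hex()                       # '0x1.b333333333334p+1'

# Drop the '0x' prefix and separate the mantissa from the exponent.
mantissa_str, exponent_str = hex_repr[2:].split("p")
int_part, frac_part = mantissa_str.split(".")

# Digits left of the point: here a single hexadecimal digit.
value = int(int_part, 16)

# Digits right of the point: multiply by 16**-1, 16**-2, ... and sum.
for position, digit in enumerate(frac_part, start=1):
    value += int(digit, 16) * 16 ** (-position)

# The exponent after 'p' is a power of 2.
value *= 2 ** int(exponent_str)

print(value)
print(value == z)
```

The built-in class method `float.fromhex(hex_repr)` performs the same conversion in one step and can be used to check the result.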

### Task

Consider the complex number `x=1.2+2.3j`. Using `abs`, `real` and `imag`, find the following.

1. the absolute value of `x`.
2. the real part of `x`.
3. the imaginary part of `x`.

In [9]:
# Complex number
x = 1.2 + 2.3j

# Absolute value of x
absl_value = abs(x)

# Obtain the real part of x
real_x = x.real

# Obtain the imaginary part of x
img_part = x.imag

# Print the results
print("Absolute value of x:", absl_value)
print("Real part of x:", real_x)
print("Imaginary part of x:", img_part)


Absolute value of x: 2.5942243542145693
Real part of x: 1.2
Imaginary part of x: 2.3
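
As a quick cross-check, the absolute value can also be computed by hand from the real and imaginary parts, since it is the length of the vector (real, imag). The sketch below uses `math.sqrt` and compares with `math.isclose`, because the two routes may differ in the last binary digit; the name `by_hand` is chosen here for illustration.

```python
import math

x = 1.2 + 2.3j

# The absolute value of a complex number is sqrt(real**2 + imag**2).
by_hand = math.sqrt(x.real ** 2 + x.imag ** 2)

print(abs(x))
print(by_hand)
print(math.isclose(abs(x), by_hand))
```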


### Task

We want to assign the complex number with real part 1.2 and imaginary part 2.3 to a variable `x`.  Which of the following does not raise an error?  Try it on your console or in the evaluation block below.

1. `x= 1.2 + j2.3`
2. `x= 1.2 + 2.3 j`
3. `x= 1.2 + 2.3j`

In [10]:
# Assign the CORRECT complex number
x = 1.2 + 2.3j

# Print x
print(x)


(1.2+2.3j)
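
The three candidates can also be tested without stopping at the first error. The sketch below passes each line to the built-in `compile` and records whether a `SyntaxError` is raised; the names `candidates` and `outcome` are chosen here for illustration. It also shows the built-in `complex` constructor as an alternative to the literal.

```python
candidates = ["x = 1.2 + j2.3", "x = 1.2 + 2.3 j", "x = 1.2 + 2.3j"]

outcome = {}
for source in candidates:
    try:
        compile(source, "<candidate>", "exec")
        outcome[source] = "valid"
    except SyntaxError:
        outcome[source] = "SyntaxError"
    print(f"{source!r}: {outcome[source]}")

# The built-in complex() constructor builds the same number from its parts.
x = complex(1.2, 2.3)
print(x == 1.2 + 2.3j)
```

In a complex literal, the `j` must follow the number directly, with no space in between and not in front of it.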


## Strings

### Task

We want to prepare an email.  Here is a way using strings.

In [9]:
e_greeting="Dear "
e_to= "John"
newline ="""
"""
e_text = "Thanks for approaching. I'll get back to you as soon as possible after solving this task."
e_end = "Best regards,"
e_from = "XYZ"

email_body = e_greeting + e_to + 2*newline + e_text + 2*newline + e_end + 2*newline + e_from

print(email_body)

Dear John

Thanks for approaching. I'll get back to you as soon as possible after solving this task.

Best regards,

XYZ
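
The same email can also be assembled with the string method `join`, which places a separator between the items of a list. This is an alternative sketch, not the approach used in the cell above; the list name `paragraphs` is chosen here for illustration.

```python
e_greeting = "Dear "
e_to = "John"
e_text = "Thanks for approaching. I'll get back to you as soon as possible after solving this task."
e_end = "Best regards,"
e_from = "XYZ"

# Two newline characters between the parts give one blank line between them.
paragraphs = [e_greeting + e_to, e_text, e_end, e_from]
email_body = "\n\n".join(paragraphs)

print(email_body)
```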


Your task is to

1. change the name from `"John"` to `"Mike"`
2. remove the substring "Thanks for approaching".
3. change the `XYZ` to some name.
4. print the email using `print` function while making sure that there are two new lines between lines except between the last two lines.

In [11]:
e_greeting = "Dear "
e_to = "Mike"
newline = "\n"
e_text = "I'll get back to you as soon as possible after solving this task."
e_end = "Best regards,"
e_from = "ABC"

# Two newlines between the parts, except a single newline between the last two lines
email_body = e_greeting + e_to + 2 * newline + e_text + 2 * newline + e_end + newline + e_from

print(email_body)

Dear Mike

I'll get back to you as soon as possible after solving this task.

Best regards,
ABC


## Lists and Tuples


### Task

Let us consider the sentence "This sentence is false." and assign it as a string to a variable `x`.  After that, we split `x` using a string method and notice that a list is returned.  See below.

In [10]:
x = "This sentence is false."
l = x.split(" ")
l.reverse()
l

['false.', 'is', 'sentence', 'This']

Your task is to
1. reverse the list using the callable method `reverse`. See it in `dir(list)`.
2. change the first element of the list to its capitalized version and remove the period (`.`) from it.
3. make the last element of the list a lowercase word.
4. use `" ".join(l)` to join the list `l` into a single string.
5. print the result.

In [12]:
x = "This sentence is false."
l = x.split(" ")

# Reverse the list
l.reverse()

# Capitalize the first element and remove the trailing "."
l[0] = l[0].capitalize().rstrip('.')

# Convert the last element to lowercase
l[-1] = l[-1].lower()

# Join the list into a single string
result = " ".join(l)

print(result)


False is sentence this
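
Note that `reverse` changes the list in place and returns `None`. When the original order is still needed, slicing with a step of -1 gives a reversed copy instead. The sketch below contrasts the two; the names `words`, `reversed_copy` and `return_value` are chosen here for illustration.

```python
words = "This sentence is false.".split(" ")

# Slicing with a step of -1 returns a new, reversed list; `words` is unchanged.
reversed_copy = words[::-1]

# list.reverse() reverses the list in place and returns None.
return_value = words.reverse()

print(reversed_copy)
print(words)
print(return_value)
```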


### Task

Recall that lists are mutable objects, but tuples are not.  This means that the elements within a list can be changed, but the elements of a tuple cannot.

Consider now `x = (1,2,[3,4])`.  Is it possible to change 4 to something else?  Why or why not?

In [15]:
'''
By definition:
Tuples are immutable, meaning once they are created, their elements cannot be changed, added, or removed.
However, the list inside the tuple can be changed because it is its own object.
'''
x = (1, 2, [3, 4])

# Modify the list inside the tuple
x[2][1] = 33

print(x)


(1, 2, [3, 33])
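
To see both sides of this behaviour, the sketch below first tries to replace an element of the tuple itself, which raises a `TypeError`, and then changes the list stored inside it. The name `inner_id_before` is chosen here for illustration; `id` is used only to show that the tuple still refers to the same list object.

```python
x = (1, 2, [3, 4])
inner_id_before = id(x[2])

# Replacing an element of the tuple itself is not allowed.
try:
    x[0] = 10
except TypeError as error:
    print("TypeError:", error)

# Changing the list stored in the tuple is allowed.
x[2][1] = 33
x[2].append(5)

print(x)
print(id(x[2]) == inner_id_before)
```

The tuple never changes which objects it refers to; only the contents of the mutable list change.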


## Dictionaries

### Task

Let us make a small database.  You can use it for personal use.

Let `biblio` be a list of books and articles that you are interested in.  At the beginning the list is empty.  We intend to store dictionaries in this list, where each dictionary is one record.  See below for an example.

The key `authors` is a list of authors.  This is practical because some items might have multiple authors.


In [11]:
biblio = []                      # empty bibliography, for now

biblio = [ {"id": "skiena2017data",
            "title": "The data science design manual",
            "authors": ["Steven Skiena"],
            "year": 2017,
            "publisher": "Springer"}
         ]

Your task is to 
1. add a new record to `biblio` of your choice.
2. add a new record to `biblio` with multiple authors.
3. remove the record with `id` `skiena2017data`.
4. print the record that you have added in task 2 above in the format "[authors]: [title], [publisher], [year]."

In [16]:
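# Start from the bibliography defined above, so that Task 3 has a record to remove
biblio = [{"id": "skiena2017data",
           "title": "The data science design manual",
           "authors": ["Steven Skiena"],
           "year": 2017,
           "publisher": "Springer"}]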

# Task 1: Add a new record to biblio
biblio.append({
    "id": "cyan2004methods",
    "title": "Typical Topics of Technology",
    "authors": ["Edison Newark"],
    "year": 2004,
    "publisher": "Caltech"
})

# Task 2: Add a new record to biblio with multiple authors
biblio.append({
    "id": "tuesday2014wc",
    "title": "Ultra-memory",
    "authors": ["John Brzenk", "Devon Larratt", "Jason Todd", "Joseph Tribbiani"],
    "year": 2014,
    "publisher": "Harvard Press"
})

# Task 3: Remove the record with id skiena2017data
for item in biblio:
    if item["id"] == "skiena2017data":
        biblio.remove(item)
        break  # Exit loop after removing the record

# Task 4: Print the record added in Task 2
for item in biblio:
    if item["id"] == "tuesday2014wc":
        authors = ", ".join(item["authors"])
        print(f"{authors}: {item['title']}, {item['publisher']}, {item['year']}")
        break  # Exit loop after printing the record


John Brzenk, Devon Larratt, Jason Todd, Joseph Tribbiani: Ultra-memory, Harvard Press, 2014
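
When records are looked up by `id` repeatedly, it can help to index them in a dictionary and to move the formatting into a function. The sketch below is one possible arrangement; `format_record`, `by_id` and `remaining` are hypothetical names introduced here, not part of the original task.

```python
def format_record(record):
    """Return a record as '[authors]: [title], [publisher], [year]'."""
    authors = ", ".join(record["authors"])
    return f"{authors}: {record['title']}, {record['publisher']}, {record['year']}"


biblio = [
    {"id": "skiena2017data",
     "title": "The data science design manual",
     "authors": ["Steven Skiena"],
     "year": 2017,
     "publisher": "Springer"},
]

# Index the records by their id for direct lookup.
by_id = {record["id"]: record for record in biblio}
print(format_record(by_id["skiena2017data"]))

# Remove a record by keeping every record whose id differs.
remaining = [record for record in biblio if record["id"] != "skiena2017data"]
print(len(remaining))
```

Building a new list with a comprehension avoids changing `biblio` while looping over it.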


# Final Remark

There are a lot of subtopics that we are not covering.  We shall come back to them as we progress and use them implicitly. For now, this is a good point to finish the practice for the first week.  You are encouraged to explore the topics or subtopics in built-in data structures that you found interesting, to see what you can build with them.