# Python Introduction

_Python_ is a high-level, interpreted, dynamically and strongly typed, multi-paradigm, general-purpose programming language.

## Variables in Python

In Python, __variables__ are used to _store_ data values. A __variable__ is created when a value is assigned to it using the _assignment operator_ (=). 

For example, to assign the value $10$ to a _variable_ named `x`, we can use the following code:

```python
x = 10
```

After the assignment, the _variable_ `x` will hold the value $10$, and we can use it in our code to perform operations or manipulate the data.

__Variables__ in Python are _dynamically typed_, which means that the type of a variable is determined at _runtime_ based on the value assigned to it. This allows us to assign different types of values to the same variable.
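As a small illustration of dynamic typing, the sketch below rebinds a single name to values of different types. The name `value` and the helper names `first_type`, `second_type`, and `third_type` are placeholders chosen for this example.

```python
# The same name can be rebound to values of different types.
value = 10
first_type = type(value).__name__   # name of the type currently bound to value

value = "ten"
second_type = type(value).__name__

value = 10.0
third_type = type(value).__name__

print(first_type, second_type, third_type)
```

The type belongs to the value, not to the name, which is why the same name can refer to an integer, then a string, then a float.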

Python also supports _multiple assignment_, where we can assign values to multiple variables in a single line. For example:

```python
a, b, c = 1, 2, 3
```

In this case, the values $1$, $2$, and $3$ are assigned to _variables_ `a`, `b`, and `c` respectively.

__Variables__ in Python are _case-sensitive_, which means that `x` and `X` are considered as _different_ variables.

Overall, variables in Python are a fundamental concept that allows us to store and manipulate data in our programs.

In [None]:
# create variables

print("Value of x:", )

# get type
print("Type of x:", )

# get memory reference
print("Memory reference of x:", )

## Conditionals

Conditionals are used to make decisions in a program based on certain conditions.
In _Python_, conditionals are implemented using `if`, `elif`, and `else` statements.

- The `if` statement is used to check a condition and execute a block of code if the condition is __true__.
- The `elif` statement is used to check additional conditions if the previous conditions are __false__.
- The `else` statement is used to execute a block of code if none of the previous conditions are __true__.

__Conditionals__ allow the program to take different paths based on the values of variables or the result of _comparisons_.
They are essential for controlling the _flow of execution_ in a program and making it more dynamic and responsive.
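The three statements described above can be combined in a single chain. The following is a minimal sketch; the variable `temperature` and the thresholds are arbitrary values assumed for illustration.

```python
temperature = 12  # hypothetical reading in degrees Celsius

if temperature < 0:
    label = "freezing"   # runs only when the first condition is true
elif temperature < 20:
    label = "cool"       # checked only when the first condition is false
else:
    label = "warm"       # runs when no previous condition is true

print("Label:", label)
```

Only one branch of the chain runs: the first one whose condition is true.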

In [None]:
# simple conditional

print("Result:", )

In [None]:
# nested conditional

print("Result:", )

In [None]:
# elif conditional

print("Result:", )

## Loops and Range

### Range

The `range` _function_ in _Python_ generates a __sequence of numbers__ within a specified range. It is commonly used in for loops to iterate over a sequence of numbers. The `range` function can take up to _three arguments_: start, stop, and step. The `start` argument specifies the starting value of the sequence (default is $0$), the `stop` argument specifies the ending value (exclusive), and the `step` argument specifies the increment (default is $1$). The `range` _function_ returns an __iterable object__ that can be converted to a list or used directly in a loop.
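To make the roles of `start`, `stop`, and `step` concrete, the sketch below converts a few ranges to lists. The variable names are assumptions made for this example.

```python
only_stop = list(range(5))          # starts at 0, stops before 5
start_stop = list(range(2, 6))      # starts at 2, stops before 6
with_step = list(range(1, 10, 2))   # every second number starting at 1
countdown = list(range(5, 0, -1))   # a negative step counts downwards

print(only_stop, start_stop, with_step, countdown)
```

Note that the `stop` value itself is never part of the sequence, also when counting down.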

### Loops

__Loops__ are control structures that allow us to repeat a block of code multiple times.
In _Python_, there are two types of _loops_: the `for` loop and the `while` loop.

- The `for` loop is used to iterate over a __sequence__ (such as a list, tuple, or string) or other iterable objects. It executes a block of code for _each item_ in the sequence. The loop variable takes on the value of each item in the sequence, _one by one_.
- The `while` loop is used to repeatedly execute a block of code as long as a __certain condition__ is ___true___. It continues to execute the code until the condition becomes ___false___. 
 
Both types of loops automate repetitive tasks and avoid writing the same code several times.
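The two loop types can be sketched as follows. The data and variable names are assumed for illustration; the last loop also uses the `break` and `continue` statements, which leave a loop early and skip to the next iteration respectively.

```python
# for loop: visit each item of a sequence
total = 0
for number in [1, 2, 3, 4]:
    total += number

# while loop: repeat as long as the condition is true
countdown = []
remaining = 3
while remaining > 0:
    countdown.append(remaining)
    remaining -= 1

# break stops the loop; continue skips the rest of the current iteration
odds_below_seven = []
for number in range(10):
    if number >= 7:
        break
    if number % 2 == 0:
        continue
    odds_below_seven.append(number)

print(total, countdown, odds_below_seven)
```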



In [None]:
# range just with stop
print("Range just with stop:", )

# range with start and stop
print("Range with start and stop:", )

# range jumping by odd numbers
print("Range jumping by odd numbers:", )

# range making a countdown
print("Range making a countdown:", )

In [None]:
# while loop


In [None]:
# for loop 


In [None]:
# while loop with break


In [None]:
# while loop with continue 


## Lists

A `list` is a __collection__ of items that are _ordered_ and _changeable_.

__Lists__ are defined by enclosing items in _square brackets_ [ ] and separating them with commas.

__Lists__ can contain elements of _different data types_, such as integers, strings, or even other lists.

__Lists__ in Python are _mutable_, meaning that you can modify the elements of a list after it is created.
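A short sketch of list mutability, using example data that is not part of the exercises: each call below changes the same list object in place.

```python
numbers = [1, 2, 3]

numbers.append(4)       # add an item at the end
numbers.insert(0, 0)    # add an item at index 0
numbers.remove(2)       # remove the first occurrence of the value 2
last = numbers.pop()    # remove and return the last item
first = numbers.pop(0)  # remove and return the item at index 0

mixed = [1, "two", [3, 4]]  # items of different types, including a list
nested_item = mixed[2][0]   # index twice to reach inside the inner list

print(numbers, last, first, nested_item)
```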


In [None]:
list_elements = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# adding using append

print("List after append:", )


# adding using insert

print("List after insert:", )

In [None]:
# removing with remove
print("List after remove:", )

# removing with pop 
print("List after pop:", )

# removing with pop by index
print("List after pop by index:", )

In [None]:
# a list of lists


# conditionals (positive) with lists


# conditionals (negative) with lists


# traversing the list


## Dictionaries

__Dictionaries__ are a _built-in data structure_ in Python that allow you to store and retrieve data using __key-value pairs__.

A _dictionary_ (`dict`) is created using _curly braces_ {} and populated with key-value pairs.
The __keys__ must be _hashable_ (for example __strings__, numbers, or tuples of hashable items), and the __values__ can be of _any data type_.
The __dictionary__ is then accessed using the _keys_ to __retrieve__ the corresponding _values_.
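The operations described above might look like this in practice. The dictionary contents are made-up sample data.

```python
person = {"name": "Ada", "age": 36}   # hypothetical data

person["city"] = "London"             # add a new key-value pair
name = person["name"]                 # retrieve a value by its key
removed_age = person.pop("age")       # remove a key and return its value
nickname = person.get("nickname", "unknown")  # default when the key is missing

labels = {(0, 0): "origin"}           # a tuple used as a key
keys = list(person.keys())

print(person, name, removed_age, nickname, keys)
```

Accessing a missing key with square brackets raises `KeyError`, which is why `get()` with a default is used for the optional `nickname` key.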

In [None]:
from pprint import pprint

# a simple dictionary

print("Simple dictionary:")
pprint()

# adding a new key

print("After adding a new key:")
pprint()

# removing a key

print("After removing a key:")
pprint()

In [None]:
# a dictionary of dictionaries

print("Dictionary of dictionaries:")
pprint()

# a list of dictionaries

print("List of dictionaries:")
pprint()

In [None]:
# reading a JSON in python into a dictionary

print("Reading JSON into a dictionary:")
pprint()

In [None]:
# getting keys

print("Getting keys from the dictionary:\n", )

# getting values

print("Getting values from the dictionary:\n", )

# traversal with a loop


## Sets

__Sets__ are _unordered collections_ of _unique elements_. They are commonly used to perform mathematical __set operations__ such as _union_, _intersection_, and _difference_.

To use __sets__ in _Python_, you can create a set by enclosing comma-separated elements within _curly braces_ {}. You can also use the `set()` function to create a __set__ _from_ an _iterable object_. Note that empty braces `{}` create a dictionary, so an empty set must be created with `set()`.

Note that sets __do not allow__ _duplicate_ elements, so any duplicate elements will be _automatically removed_.
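The sketch below illustrates duplicate removal and the set operations mentioned above, using operator syntax. The sets `a` and `b` are example values.

```python
a = {1, 2, 3, 3}       # the duplicate 3 is stored only once
b = set([3, 4, 5])     # a set built from a list

union = a | b                 # elements in either set
intersection = a & b          # elements in both sets
difference = a - b            # elements in a but not in b
symmetric_difference = a ^ b  # elements in exactly one of the sets

print(union, intersection, difference, symmetric_difference)
```

The same operations are also available as methods, for example `a.union(b)` and `a.intersection(b)`.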

In [None]:
# creating a set

print("Creating a set:\n", )

# creating a set from a list

print("Creating a set from a list:\n", )

In [None]:
# union

print("Union of two sets:\n", )

# intersection

print("Intersection of two sets:\n", )

# difference

print("Difference of two sets:\n", )

# symmetric difference

print("Symmetric difference of two sets:\n", )

## Tuples

__Tuples__ are _immutable sequences_ that, like _lists_, can _store multiple items_.

They are defined using _parentheses_ and can contain elements of _different data types_.

Tuples are commonly used to __group related data__ together and can be accessed using _indexing_.

Unlike _lists_, tuples __cannot__ be _modified_ once created, making them useful for storing data that _should not be changed_.
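A brief sketch of these properties, with made-up values: indexing and unpacking work as with lists, while item assignment fails.

```python
point = (3, "blue", 4.5)       # elements of different data types

first = point[0]               # access by index
x, color, weight = point       # unpack all elements at once

try:
    point[0] = 10              # tuples do not support item assignment
    assignment_failed = False
except TypeError:
    assignment_failed = True

from_list = tuple([1, 2, 3])   # build a tuple from a list

print(first, color, assignment_failed, from_list)
```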

In [None]:
# tuple of different data types

print("Tuple of different data types:\n", )

# accessing by index
print("First element:", )
print("Second element:", )
print("Third element:", )

In [None]:
# create a tuple from a list

print("Tuple created from list:", )

# traversal with a loop and with a conditional


## List Comprehensions

__List comprehensions__ provide a concise way to _create lists_ based on existing lists or other _iterable objects_. They allow you to transform and filter elements from the original iterable in a _single line of code_.

The general syntax of a list comprehension is:
`[expression for item in iterable if condition]`

- `expression` is the value or transformation applied to each item in the iterable.
- `item` is the variable that represents each element in the iterable.
- `iterable` is the original list or other iterable object.
- `condition` (optional) is a filter that determines whether an item should be included in the new list.

__List comprehensions__ are often used as a more readable and concise alternative to traditional for loops when creating _new lists_. They can be used to perform __operations__ such as _filtering_, _mapping_, and _transforming elements_.

They are a feature of _Python_ that can help _simplify code_ and make it more expressive.
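To connect the general syntax to something concrete, here is a small sketch with an example list. The last part shows the loop that the filtering comprehension replaces.

```python
numbers = [1, 2, 3, 4, 5, 6]

# expression only: transform every item
squares = [n ** 2 for n in numbers]

# expression plus condition: keep only the items that pass the filter
evens = [n for n in numbers if n % 2 == 0]

# the equivalent traditional loop
evens_loop = []
for n in numbers:
    if n % 2 == 0:
        evens_loop.append(n)

print(squares, evens, evens_loop)
```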

In [None]:
# power of 2 from a range

print("Power of 2 from a range:\n", )

# increase 1 to elements of a list

print("Increased elements of the list:\n", )

# filtering even numbers of a list

print("Filtered even numbers:\n", )

# filtering names with an 'e'

print("Filtered names with 'e':\n", )

## Functions

A __function__ in _Python_ is a block of organized, _reusable code_ that is used to perform a _single_, related _action_. __Functions__ provide better _modularity_ for your application and a high degree of _code reuse_. They are defined using the `def` _keyword_, can accept parameters, and can return results.

### Built-In Functions

A __built-in function__ in _Python_ is a _function_ that is pre-defined and available for use _without_ the need for `import` or definition by the user. These functions are an _integral part_ of the _Python_ language and provide basic functionality, such as converting types, performing mathematical calculations, and interacting with the core parts of the language.

### Variadic Functions

A __variadic function__ in _Python_ is a _function_ that can accept an arbitrary number of arguments. This allows the function to be _flexible_ in the number of values it processes. __Variadic functions__ are defined using the `*args` syntax for _positional_ arguments and `**kwargs` for _keyword_ arguments, enabling them to handle a varying amount of input data gracefully.
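As an illustration, the hypothetical function `summarize` below accepts any number of positional and keyword arguments. Inside the function, `args` is a tuple and `kwargs` is a dictionary; `sum` and `sorted` are built-in functions.

```python
def summarize(*args, **kwargs):
    """Return the sum of the positional arguments and the sorted keyword names."""
    total = sum(args)          # args is a tuple of positional arguments
    names = sorted(kwargs)     # kwargs is a dict; sorting it yields its keys
    return total, names


total, names = summarize(1, 2, 3, unit="cm", label="width")
empty_total, empty_names = summarize()   # calling with no arguments also works

print(total, names, empty_total, empty_names)
```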

In [None]:
# function with no return


print("Function with no return - positive case:\n", )
print("Function with no return - negative case:\n", )

In [None]:
# function with parameters and type hints


In [None]:
# function returning a tuple

print("Function returning a tuple:\n", )

In [None]:
# function with a docstring


In [None]:
# function with variable parameters in a tuple


print("Function with variable parameters in a tuple:\n", )

In [None]:
# function with variable parameters in a dictionary


print("Function with variable parameters in a dictionary:\n", )

In [None]:
# recursive function


print("Recursive function - factorial of 5:\n", )

## Iterators

__Iterators__ are objects that allow iteration over a _collection of elements_. They provide a way to access the elements of a collection _one by one_, without the need to know the underlying structure of the collection.

In _Python_, __iterators__ implement the `__iter__()` and `__next__()` methods, which are called through the built-in `iter()` and `next()` functions. When there are no more elements to return, the `__next__()` method raises the `StopIteration` _exception_.

__Iterators__ are commonly used in _for loops_ to iterate over elements in a sequence, such as _lists_, _tuples_, or _strings_.
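The sketch below steps through a list by hand, which is roughly what a `for` loop does behind the scenes. The list contents are example data.

```python
letters = iter(["a", "b"])   # obtain an iterator from a list

first = next(letters)        # each call returns the next element
second = next(letters)

try:
    next(letters)            # no elements left
    exhausted = False
except StopIteration:
    exhausted = True

print(first, second, exhausted)
```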

### Map Function

The `map` _function_ in _Python_ applies a given function to each item of an iterable (like a list, tuple, etc.) and returns a __map object__ (which is an _iterator_) of the results. This is often used to perform _some operation_ on a collection of items and _generate a new collection_ containing the results.

### Filter Function 

The `filter` _function_ in _Python_ takes a function and an iterable as arguments and constructs an iterator from those elements of the iterable for which the function _returns_ __true__. Essentially, it filters out the elements in a collection, only _keeping those that satisfy a specific condition_.
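A minimal sketch of both functions, using two small helper functions (`square` and `is_positive`) defined only for this example. Because `map` and `filter` return iterators, the results are wrapped in `list()` to see them.

```python
def square(n):
    return n * n


def is_positive(n):
    return n > 0


values = [-2, -1, 0, 1, 2]

squared = list(map(square, values))            # apply square to every item
positives = list(filter(is_positive, values))  # keep items where the test is true

print(squared, positives)
```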

In [None]:
# filter a list of dictionaries


print("Filter a list of dictionaries:\n", )

In [None]:
# convert a list of numbers in string format and raise each to the power of 4


print("List of numbers in string format with power of 4:\n", )

## Lambda Functions

A __lambda function__ in _Python_ is a small _anonymous function_ that can take any number of arguments but can only have _one expression_. It is defined using the `lambda` keyword, followed by a comma-separated list of parameters, a colon, and the expression to be evaluated. __Lambda functions__ are commonly used when a small function is needed for a _short period of time_ and it is not necessary to define a named function. They are often used in combination with higher-order functions like `map()`, `filter()`, and `functools.reduce()`.
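The following sketch shows the syntax with one and two parameters, and a lambda passed to higher-order functions. All names and data are assumptions made for illustration.

```python
from functools import reduce

double = lambda x: x * 2     # one parameter
add = lambda x, y: x + y     # two parameters

words = ["pear", "fig", "banana"]
by_length = sorted(words, key=lambda word: len(word))  # sort by word length

product = reduce(lambda acc, n: acc * n, [1, 2, 3, 4])  # fold a list into one value

print(double(4), add(4, 5), by_length, product)
```

When a lambda needs a name, as `double` and `add` do here, a regular `def` is usually the clearer choice; lambdas fit best as short inline arguments such as the `key` above.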



In [None]:
# lambda function in a variable
fun = lambda 
result = fun(4)

# lambda function of two parameters in a variable
fun = lambda 
result = fun(4, 5)

# sort a list of tuples by the second item
pairs = [(1, 2), (3, 1), (5, 8), (7, 4), (9, 5)]
pairs.sort(key = lambda )

# filter a list to include multiples of 3
filtered = list(filter(lambda ))

# create a new list with the cube of each number
cube = list(map(lambda ))

# sort a list of strings based on length
strings = ["hello", "world", "python", "is", "awesome"]
strings.sort(key=lambda )

## Classes

A `class` is a blueprint for creating objects in Python. It defines a set of _attributes_ and _methods_ that the objects of the class will have.
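One possible sketch of a blueprint and an object created from it is shown below. The classes `Shape` and `Rectangle` are hypothetical; the example assumes the standard-library `abc` module to mark `Shape` as abstract, meaning it cannot be instantiated directly.

```python
from abc import ABC, abstractmethod


class Shape(ABC):
    """Abstract class: declares what every shape must provide."""

    def __init__(self, name):
        self.name = name       # attribute shared by all shapes

    @abstractmethod
    def area(self):
        """Return the area of the shape."""


class Rectangle(Shape):
    """Concrete class: implements the abstract method."""

    def __init__(self, width, height):
        super().__init__("rectangle")
        self.width = width
        self.height = height

    def area(self):
        return self.width * self.height


rect = Rectangle(2, 3)         # instantiate a concrete object
print(rect.name, rect.area())
```

Trying `Shape("generic")` raises `TypeError`, because the abstract method `area` has no implementation.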

In [None]:
# create an abstract class


In [None]:
# create a concrete class


In [None]:
# instantiate a concrete object


# Numpy

__Numpy__ is a _Python package_ that stands for ___Numerical Python___. It is a _library_ for the _Python_ programming language, adding support for large, _multi-dimensional arrays_ and _matrices_, along with a _large collection_ of high-level mathematical functions to operate on these _arrays_.

__Numpy__ provides a powerful _N-dimensional_ array object, useful for performing mathematical and logical operations on arrays. It also has _functions_ for working in the domains of _linear algebra_, _Fourier transforms_, and _matrices_.

By importing `numpy as np`, we can access all the functions and methods provided by the _numpy_ package using the `np` alias.
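A minimal sketch of array creation and inspection through the `np` alias follows. The values are arbitrary and the variable names are assumptions for this example.

```python
import numpy as np

array = np.array([1, 2, 3, 4])    # one-dimensional array from a list
stepped = np.arange(0, 10, 2)     # like range: start, stop (exclusive), step
evenly = np.linspace(0, 1, 5)     # 5 evenly spaced values from 0 to 1 inclusive
zeros = np.zeros((2, 3))          # matrix of zeros with 2 rows and 3 columns

print("Shapes:", array.shape, zeros.shape)
print("Dimensions:", array.ndim, zeros.ndim)
print("Sum, mean, variance:", array.sum(), array.mean(), array.var())
```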

In [None]:
import numpy as np

# create an array
array = 
print("Array 1:", array)

# create an array with range
print("Array 2:", )

# create an array with linspace
print("Array 3:", )

# create a matrix of zeros
matrix = 
print("Matrix:", matrix)

In [None]:
# get shape of array
print("Array Shape:", )

# get shape of matrix
print("Matrix Shape:", )

# get number of dimensions of array
print("Array Dimensions:", )

# get number of dimensions of matrix
print("Matrix Dimensions:", )

In [None]:
# ========== descriptive statistics ========== #

# get sum of array
print("Sum of array:", )

# get mean of array
print("Mean of array:", )

# get standard deviation of array
print("Standard Deviation of array:", )

# get variance of array
print("Variance of array:", )

# get min of array
print("Min of array:", )

# get max of array
print("Max of array:", )

## Linear Algebra with Numpy

_Numpy_ is a powerful _Python_ package that provides support for _linear algebra_ operations. It allows us to perform various mathematical operations on _arrays_ and _matrices_ efficiently.

With __Numpy__, we can easily solve _linear algebra_ problems such as finding solutions to systems of linear equations, calculating matrix determinants, eigenvalues, eigenvectors, and much more.
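As a sketch of these operations, the example below uses functions from `np.linalg` on a small, made-up system: x + y = 3 and x - y = 1.

```python
import numpy as np

# hypothetical system: x + y = 3 and x - y = 1
coefficients = np.array([[1.0, 1.0], [1.0, -1.0]])
constants = np.array([3.0, 1.0])

solution = np.linalg.solve(coefficients, constants)   # values of x and y
determinant = np.linalg.det(coefficients)             # ad - bc for a 2x2 matrix
inverse = np.linalg.inv(coefficients)                 # exists because det != 0
eigenvalues, eigenvectors = np.linalg.eig(coefficients)

print("Solution:", solution)
print("Determinant:", determinant)
print("Inverse:\n", inverse)
print("Eigenvalues:", eigenvalues)
```

The eigenvectors are returned as the columns of `eigenvectors`, so column `j` pairs with `eigenvalues[j]`.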


In [None]:
# solve a system of linear equations
equations = np.array([[2, 1], [3, 5]])
answers = np.array([1, 2])

print("Solutions:", )
# to solve a system of linear equations, the number of equations must equal the number of unknowns

In [None]:
# get the inverse of a matrix
matrix = np.array([[1, 2], [3, 4]])
print("Inverse of Matrix:", )

# to get the inverse of a matrix:
# the matrix must be square (number of rows = number of columns)
# the matrix must be non-singular (determinant is not zero)
# for a 2x2 matrix [[a, b], [c, d]]:
# inverse = [[d/(ad-bc), -b/(ad-bc)], [-c/(ad-bc), a/(ad-bc)]]

In [None]:
# get the determinant of a matrix
print("Determinant of Matrix:", )

# the determinant of a matrix is a scalar value; for a 2x2 matrix [[a, b], [c, d]] it is
# det(A) = ad - bc

In [None]:
# get the dot product of two arrays
array_1 = np.array([1, 2, 3])
array_2 = np.array([4, 5, 6])

print("Dot Product:", )
# dot = 1*4 + 2*5 + 3*6

In [None]:
# get the cross product of two arrays
print("Cross Product:", )

# cross = [2*6-3*5, 3*4-1*6, 1*5-2*4]

In [None]:
# get the eigenvalues and eigenvectors of a matrix
print("Eigenvalues:", )
print("Eigenvectors:", )

# the eigenvalues of a matrix are the values that satisfy the equation det(A - λI) = 0
# the eigenvectors of a matrix are the vectors that satisfy the equation A*v = λ*v

# eigenvectors are used in principal component analysis (PCA) to reduce the dimensionality of data  

## Vectorization

__Vectorization__ is a technique in computer programming that allows us to perform operations on _entire arrays or matrices_ instead of looping through each element individually. This approach leverages the power of optimized, low-level operations provided by libraries like __NumPy__.

By using _vectorized operations_, we can significantly improve the _performance_ of our code, because the element-by-element loop runs inside NumPy's compiled code, which can use the vector (SIMD) instructions of modern _CPUs_, instead of in the Python interpreter. Instead of writing explicit loops, we can express our computations as _mathematical expressions_ on arrays, making our code more concise and readable.

For example, instead of iterating over each element of an array to calculate the square root, we can simply apply the `np.sqrt()` function to the _entire array_. This not only simplifies the code but also improves its efficiency.
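The square-root example can be sketched as follows, with a small made-up array. Both versions produce the same values; only the first needs an explicit Python loop.

```python
import math

import numpy as np

values = np.array([1.0, 4.0, 9.0, 16.0])

# explicit loop: one Python-level call per element
roots_loop = np.zeros_like(values)
for i in range(values.shape[0]):
    roots_loop[i] = math.sqrt(values[i])

# vectorized: a single call on the entire array
roots_vectorized = np.sqrt(values)

print(roots_loop, roots_vectorized)
```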

In addition to arithmetic operations, __vectorization__ also enables us to perform logical operations, mathematical functions, and other operations on arrays and matrices. This makes it a powerful tool for _scientific computing_, _data analysis_, and _machine learning_ tasks.

Overall, __vectorization__ is a fundamental concept in array programming that allows us to write efficient and concise code by operating on entire arrays or matrices at once. It is a key technique to leverage the full potential of libraries like __NumPy__ and optimize our computations.

In [None]:
import functools
import time

import memory_profiler

# decorator to time a function
def time_function(func):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        end = time.perf_counter()
        print(f"Time taken: {end - start:.4f} s")
        return result
    return wrapper

# decorator to get memory usage of a function
def memory_function(func):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # retval=True also returns the value produced by func
        usage, result = memory_profiler.memory_usage((func, args, kwargs), retval=True)
        print(f"Peak memory: {max(usage):.2f} MiB")
        return result
    return wrapper

In [None]:
# calculate the square root of a matrix with vectorization
@time_function
@memory_function
def sqrt_vectorized(matrix):
    return np.sqrt(matrix)

# calculate the square root of a matrix with a loop
@time_function
@memory_function
def sqrt_loop(matrix):
    result = np.zeros_like(matrix)
    for i in range(matrix.shape[0]):
        for j in range(matrix.shape[1]):
            result[i, j] = np.sqrt(matrix[i, j])
    return result

size = 1000
matrix = np.random.rand(size, size)
print("Shape:", matrix.shape)
print("Vectorized")
t = sqrt_vectorized(matrix)
print("Loop")
t = sqrt_loop(matrix)

In [None]:
# calculate the sum of two arrays with vectorization
@time_function
@memory_function
def sum_vectorized(array1, array2):
    return array1 + array2

# calculate the sum of two arrays with a loop
@time_function
@memory_function
def sum_loop(array1, array2):
    result = np.zeros_like(array1)
    for i in range(array1.shape[0]):
        result[i] = array1[i] + array2[i]
    return result

size = 10000000
array1 = np.random.rand(size)
array2 = np.random.rand(size)
print("Shape:", array1.shape)
print("Vectorized")
t = sum_vectorized(array1, array2)
print("Loop")
t = sum_loop(array1, array2)

In [None]:
# calculate broadcasted multiplication of an array and a scalar with vectorization
@time_function
@memory_function
def broadcasted_vectorized(array, scalar):
    return array * scalar

# calculate broadcasted multiplication of an array and a scalar with a loop
@time_function
@memory_function
def broadcasted_loop(array, scalar):
    result = np.zeros_like(array)
    for i in range(array.shape[0]):
        result[i] = array[i] * scalar
    return result

size = 10000000
array = np.random.rand(size)
scalar = np.random.rand()  # a single float rather than a one-element array
print("Shape:", array.shape)
print("Vectorized")
t = broadcasted_vectorized(array, scalar)
print("Loop")
t = broadcasted_loop(array, scalar)

In [None]:
# filter an array with vectorization
@time_function
@memory_function
def is_even(x):
    # index with the boolean mask so that only the even values are returned
    return x[x % 2 == 0]

# filter an array with a loop
@time_function
@memory_function
def is_even_loop(array):
    result = []
    for x in array:
        if x % 2 == 0:
            result.append(x)
    return result

# filter an array with a lambda function
@time_function
@memory_function
def is_even_lambda(array):
    return list(filter(lambda x: x % 2 == 0, array))

size = 10000000
array = np.random.randint(0, 100, size)
print("Shape:", array.shape)
print("Vectorized")
t = is_even(array)
print("Loop")
t = is_even_loop(array)
print("Lambda")
t = is_even_lambda(array)

# PANDAS

__Pandas__ is a powerful _open-source_ data manipulation and analysis library for _Python_. 

It provides data structures and functions for efficiently __handling and analyzing structured data__, such as tables or spreadsheets.

With __pandas__, you can easily _load_, _manipulate_, _analyze data_, perform _data cleaning_ and _preprocessing_ tasks, and create _visualizations_.

It is widely used in _data science_, _machine learning_, and _data analysis_ projects.

To import the pandas library under the alias `pd`, use `import pandas as pd`.

## The Series Data Structure

A __pandas Series__ is a _one-dimensional labeled array_ capable of holding any data type. It is similar to a _column_ in a spreadsheet or a SQL table, or a _dictionary-like_ object. It is a fundamental _data structure_ in the __pandas__ library, which is widely used for data manipulation and analysis in Python.

A __pandas Series__ consists of two main components: the _data_ and the _index_. The _data_ can be of any type, such as integers, floats, strings, or even complex objects. The _index_ is a sequence of labels that identifies each element in the Series; labels are usually unique, although pandas does not require it.

Some key features of pandas Series include:
- Vectorized operations: Series supports vectorized operations, allowing you to perform element-wise computations efficiently.
- Label-based indexing: You can access elements in a Series using labels instead of integer-based indexing.
- Alignment: Series automatically aligns data based on the index, making it easy to perform operations on multiple Series with different indexes.

To create a __Series__, you can pass a list, array, or dictionary-like object to the `pd.Series()` constructor. You can also specify custom index labels if needed.
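The key features listed above can be sketched with a small, made-up Series of prices. The labels and values are assumptions for illustration.

```python
import pandas as pd

prices = pd.Series([1.5, 2.0, 3.25], index=["apple", "bread", "cheese"])

by_label = prices["bread"]         # label-based indexing
by_position = prices.iloc[0]       # position-based indexing
expensive = prices[prices > 1.75]  # vectorized comparison used as a boolean mask

# alignment: values are matched by label, not by position
discount = pd.Series([0.5, 0.25], index=["cheese", "apple"])
discounted = prices - discount     # "bread" has no discount, so it becomes NaN

print(discounted)
```

Labels that appear in only one of the two Series produce a missing value (`NaN`) in the result.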

In [None]:
import pandas as pd

# Create a Series object from a list of strings
list_strings = ['a', 'b', 'c', 'd', 'e']
serie_1 = 
print("Serie 1:", serie_1)

# Create a Series object from a list of numbers
list_numbers = [1, 2, 3, 4, 5]
serie_2 = 
print("Serie 2:", serie_2)

# Create a Series object from a list of numbers with a None value
list_numbers_with_none = [1, 2, None, 4, 5]
serie_3 = 
print("Serie 3:", serie_3)

In [None]:
# Create a Series object from a dictionary
dict_data = {'a': 1, 'b': 2, 'c': 3, 'd': 4, 'e': 5}
serie_4 = 
print("Serie 4:", serie_4)

# Get the values of the Series index
print("Serie 4 index:", )

In [None]:
# Create a series object from a list of tuple pairs
list_tuples = [('a', 1), ('b', 2), ('c', 3), ('d', 4), ('e', 5)]
serie_5 = 
print("Serie 5:", )

In [None]:
# Create a series object from a list as values and a list as index
list_index = ['a', 'b', 'c', 'd', 'e']
list_values = [1, 2, 3, 4, 5]
serie_6 = 
print("Serie 6:\n", serie_6)

for index, value in :
    print(f"Index: {index}, Value: {value}")

In [None]:
# Query a Series object by boolean indexing
print("Serie 6 > 2:", )
print("-"*10)
print()

In [None]:
# Query a Series object by fancy indexing
print("Serie 6[['a', 'b']]:", )

In [None]:
# Query a Series object using loc[]
print("Serie 6.loc[['a', 'b']]:", )

In [None]:
# Query a Series object using iloc[]
print("Serie 6.iloc[0:3]:", )

## The DataFrame Data Structure

A __pandas DataFrame__ is a _two-dimensional_, _labeled_ data structure in _Python_ that is commonly used for _data manipulation and analysis_. It consists of _rows_ and _columns_, similar to a table in a relational database.

The __DataFrame__ can store _heterogeneous data types_ and provides various operations and functions to perform data manipulation, filtering, grouping, and statistical analysis.

To access and manipulate the data in the __DataFrame__, you can use various _methods_ and _attributes_ provided by the __pandas__ library.
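A few of these attributes and methods are sketched below on a small DataFrame built from a dictionary of columns. The column names and values are made-up sample data.

```python
import pandas as pd

df = pd.DataFrame({
    "product": ["apple", "bread", "cheese"],
    "price": [1.5, 2.0, 3.25],
    "units": [10, 4, 2],
})

shape = df.shape                  # (rows, columns)
columns = list(df.columns)        # column labels
price_column = df["price"]        # a single column is a Series
first_row = df.iloc[0]            # a row selected by position
cheap = df[df["price"] < 3]       # rows filtered with a boolean mask
mean_price = df["price"].mean()   # statistics on one column

print(shape, columns, mean_price)
print(cheap)
```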

For more information on __pandas DataFrame__, refer to the [official pandas documentation](https://pandas.pydata.org/docs/reference/frame.html).

In [None]:
import pandas as pd

In [None]:
# create dataframes from lists
list_example = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
df = 
print("Dataframe from list:", )
print("Dataframe shape:", )
print("Dataframe columns:", )
print("Dataframe index:", )

In [None]:
# create a dataframe from a list of dictionaries
list_dict = [{'a': 1, 'b': 2, 'c': 3}, {'a': 4, 'b': 5, 'c': 6}, {'a': 7, 'b': 8, 'c': 9}]
df = 
print("Dataframe from list of dictionaries:", )
print("Dataframe columns:", )
print("Dataframe index:", )

In [None]:
# create a dataframe from a csv file
df_csv = 
print()

In [None]:
# create a dataframe from a json file
df_json = 
print()

In [None]:
# describe a dataframe
print("Dataframe csv describe:\n", )
print("*"*10)
print("Dataframe json describe:\n",)

In [None]:
# get information about a dataframe
print("Dataframe csv info:\n", )
print("="*50)
print("Dataframe json info:\n", )

In [None]:
# indexes and columns
df_changed_index = 
print("Dataframe csv changed index:\n", )
print("*"*50)
print("Dataframe csv changed index columns:\n", )
print("Dataframe csv changed index index:\n", )

In [None]:
# rename columns
df_renamed = 
df_renamed.head()

## Using Datetime in Pandas

In [None]:
# converting a column to datetime with to_datetime()
data_df = 
print("DataFrame Info:\n", )


print("Replace date as String:\n", )


print("Sales dataframe with Date column as datetime:\n", )

In [None]:
# converting a column from datetime to string with strftime()

print(data_df.head())

In [None]:
# converting a column from datetime to a timestamp with timestamp(), then converting it to int64

print(data_df.head())

In [None]:
# descriptive statistics of a dataframe
print()

In [None]:
%%timeit -n 100
import numpy as np

print(np.round(np.sum(), 3))

In [None]:
%%timeit -n 100
import numpy as np
total = 0
for  in :
    total += 
print(np.round(total, 3))

## Queries and Transformations

In [None]:
# drop a column

print(data_copy_df.columns)
data_copy_df = 
data_copy_df.head()

In [None]:
# drop a row

data_copy_df = 
data_copy_df.head()

In [None]:
# query a dataframe by column

print(type())
print(.head())

In [None]:
# query a dataframe by row with loc
print(data_df.)

In [None]:
# query a dataframe by row with iloc
print(data_df.)

In [None]:
# query a dataframe using a boolean mask
print()

In [None]:
# change all column names to Capital Case

data_df.head()

In [None]:
# get available values in a column
print(.size)
print(.unique())

In [None]:
# query a dataframe using query()


In [None]:
# get missing values using isnull()

print(.info())
print("*"*50)
print()

In [None]:
import numpy as np

# fill missing values using fillna()
print("Fill missing with FillNA:\n", )


# fill missing attention_time with average time
print("Fill missing attention_time with average time:\n", )

# fill missing city with not reported
print("Fill missing city with not reported:\n", )

# fill missing values using fillna() in date_time column with interpolation
print("Fill missing values using fillna() in date_time column with interpolation:\n", )


In [None]:
# drop missing values using dropna()
print("Drop missing values using dropna():\n", )