# Basic Python Tutorial

<a href="https://colab.research.google.com/drive/1JEq9HFx2JuESFOx2ZrczCSahkEuvsJiF" target="_blank">
  <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab">
</a>

Return to the [castle](https://github.com/Nkluge-correa/teeny-tiny_castle).

Python is a high-level, interpreted programming language that is widely used in fields such as `data science`, `web development`, `artificial intelligence`, and `automation`. First released in 1991, it has since become one of the most popular programming languages thanks to its simple syntax, ease of use, and vast collection of libraries that make development faster and more efficient.

Since all notebooks in this course are written in Python, let us start with a quick introduction to the language. For more information, check the official tutorial at [python.org](https://docs.python.org/3/tutorial/index.html).

![python](https://i.gifer.com/origin/25/25dcee6c68fdb7ac42019def1083b2ef_w200.gif)

[Source](https://gifer.com/en/gifs/python).

But how about a more "_poetic_" definition of what Python is all about?

> **Note:** Here, we are working with Jupyter/Colab notebooks. These notebooks are divided into text cells (written in a markup language) and code cells that can be executed. Text cells are just for reading, while code cells are supposed to perform some kind of task. Try running the code cell below by pressing the "_play_" button on the top left of the cell.


In [1]:
import this

The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!


One of the advantages of using Python is its vast ecosystem of libraries and modules. Modules from the standard library, such as `this`, can be imported right away via `import your_module`. Third-party packages must first be installed (via `pip install your_module`) and can then be imported in the same way. The `this` module contains the `Zen of Python`, and you just imported it!

Later in this notebook, we will see how to install packages with `pip`. For now, let us learn something about data types and variables.

## Variables

Variables are names that refer to data values. A variable is created the moment you first assign a value to it. Python has several built-in data types whose values can be assigned to variables.

Here are some examples:

- `int`: An integer number (`num = 5`).
- `float`: A number that carries a floating point (`float_num = 3.14`).
- `complex`: A complex number (`complex_num = 5 + 5j`).
- `bool`: A boolean variable stores either True or False (`is_raining = True`).
- `str`: A string variable stores text (`name = "Alice"`).
- `list`: A list is a collection of ordered elements (`fruits = ['apple', 'banana', 'orange']`).
- `tuple`: A tuple is an ordered, immutable collection of elements (`coordinates = (2, 3)`).
- `set`: A set is an unordered collection of unique elements (`numbers = {1, 2, 3, 4}`).
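- `dict`: A dictionary is a collection of key-value pairs that, since Python 3.7, preserves insertion order (`person = {'name': 'Alice', 'age': 25}`).
- `array`: An array is a collection of values of the same data type stored in contiguous memory. Python has no built-in array type: arrays come from the standard-library `array` module or from NumPy (`scores = array('i', [90, 85, 92, 87])`, after `from array import array`).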
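The behaviors listed above can be checked directly. The snippet below is a small sketch with made-up variable names. It shows that lists are mutable, tuples are not, sets drop duplicates, dictionaries keep insertion order, and arrays from the standard-library `array` module hold a single type.

```python
from array import array

fruits = ['apple', 'banana', 'orange']   # list: ordered and mutable
fruits.append('mango')                   # lists can grow in place

coordinates = (2, 3)                     # tuple: ordered and immutable
try:
    coordinates[0] = 10                  # tuples do not support item assignment
except TypeError as error:
    tuple_error = str(error)

numbers = {1, 2, 2, 3, 3, 3}             # set: duplicates are dropped
person = {'name': 'Alice', 'age': 25}    # dict: keys map to values
person['city'] = 'Lisbon'                # new keys are added at the end

scores = array('i', [90, 85, 92, 87])    # typed array: every item is a C int

print(fruits)
print(tuple_error)
print(numbers)
print(list(person))                      # keys in insertion order
print(scores.typecode, list(scores))
```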

Other important concepts in Python are `functions` and `classes`, which let Python support both the functional and the object-oriented paradigms. Let us start with functions. One of the most famous built-in functions in Python is `print()`, used to display the value of a variable. Let us see a couple of ways we can print out strings.

In [2]:
texto = 'Bob has' # a string
texto_2 = 'years of age.' # another string
num = 33 # an integer

print(f'{texto} {num} {texto_2}')  # fancy way of doing string formatting (a.k.a f-strings)

print(texto + ' ' + str(num) + ' ' + texto_2) # str() converts a value (e.g., an int) to a string.

print('{} {} {}'.format(texto, num, texto_2)) # another fancy way of doing string formatting

print(texto, num, texto_2) # the lazy way

Bob has 33 years of age.
Bob has 33 years of age.
Bob has 33 years of age.
Bob has 33 years of age.


Another cool function to start playing with is the `input()` function. We can use it to request user input when running a Python script/notebook.


In [3]:

name = input('Type in your name and press enter: ')
age = input('Type in your age: ')

print(f'Hello {name}! You have {age} years of age.')


Hello Bob! You have 33 years of age.


If you forgot or don't know the `type` of the variable/data structure you are working with, the `type()` function can help you!


In [4]:

print(type('Hello World!'))  # string
print(type(0))  # integer
print(type(0.8))  # float
print(type(5 + 3j))  # complex
print(type(True))  # boolean
print(type([1, 2, 3, 'Hello World!', True, 0.8])) # list
print(type((0, 1))) # tuple
print(type({1, 2, 3})) # set
print(type({'a': 1, 'b': 2, 'c': 3})) # dictionary
print(type(None)) # None


<class 'str'>
<class 'int'>
<class 'float'>
<class 'complex'>
<class 'bool'>
<class 'list'>
<class 'tuple'>
<class 'set'>
<class 'dict'>
<class 'NoneType'>
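Knowing the type matters because values sometimes need to be converted before use. As a minimal sketch (the variable names are illustrative), the built-ins `int()`, `float()`, and `str()` convert between types, and `isinstance()` tests a type inside a condition.

```python
age_text = '33'            # what input() would hand back: always a string
age = int(age_text)        # '33' -> 33
height = float('1.75')     # '1.75' -> 1.75
label = str(age)           # 33 -> '33'

print(type(age_text), type(age), type(height), type(label))

# isinstance() answers "is this value of that type?" with a bool
print(isinstance(age, int))
print(isinstance(age_text, int))
```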


One of the first reasons people want to learn to program is that they want to automate something. And at the root of automation, we have conditional statements. The simplest kind of automation is making something happen only if a given condition holds, like "_if the age on the ID is less than 18, do not sell beer_." We can do this by using `if` and `else` statements.

In Python, every conditional has its own block of code. Blocks of code are delimited by indentation instead of the `{}` used in Java or C.

In [5]:
age = input('What is your age my friend?') # input stores a string

if int(age) >= 18: # int() turns the string of a number into an integer
    print("Have a beer! 🍺")
else:
    print("No beer for you kid. 🚫")

No beer for you kid. 🚫


Python supports the usual comparison operators from mathematics:

- Equals: `a == b`
- Not Equals: `a != b`
- Less than: `a < b`
- Less than or equal to: `a <= b`
- Greater than: `a > b`
- Greater than or equal to: `a >= b`
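Each comparison evaluates to a `bool`, and comparisons can be combined with `and`, `or`, and `not`. The sketch below uses invented values (`age`, `has_id`) to illustrate this.

```python
age = 20
has_id = True

is_adult = age >= 18                    # a comparison evaluates to a bool
can_buy = is_adult and has_id           # both conditions must hold
is_teen = 13 <= age <= 19               # comparisons can be chained
needs_check = not has_id or age < 25    # `not` binds tighter than `or`

print(is_adult, can_buy, is_teen, needs_check)
```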

Another useful function to know is `len()`. It tells us the length of a `string`, `list`, `tuple`, `array`, etc. Let us use `len()` together with `elif`. We use `elif` (short for "else if") to test additional conditions after the first `if`. The optional `else` always goes at the end and runs only when none of the previous conditions hold.


In [6]:
name = input('Type in your name and press enter: ')
x = len(name)

if x < 4:
    print(f'{name} is a short name! It has less than 4 characters!')
elif x > 4:
    print(f'{name} is a long name! It has more than 4 characters!')
else:
    print(f'{name} is a name of length {x}!')


Alice is a long name! It has more than 4 characters!


Besides using pre-made functions, you might want to create your own. A function is a block of code that only runs when it is called. You can pass data (arguments) into a function through its parameters, and the function can hand a result back with `return`.

> Note: Python docstrings are the string literals that appear right after defining a function, method, class, or module. They are used to provide clear instructions on how a function works.

In [7]:
def foo(x, y):
    """
    Compute the sum of two numeric arguments x and y and print the result.

    Parameters:
    x (numeric): The first argument to add.
    y (numeric): The second argument to add.

    Returns:
    None: This function does not return a value, but prints the sum of x and y.
    """
    print(f"The sum of the arguments {x} and {y} is {x + y}.")

def list_checker(x):
    """
    Compute if an object is of list type.

    Parameters:
    x (object): The object to check for being a list.


    Returns:
    None: This function does not return a value, but prints if x
    is a list or not.
    """
    if isinstance(x, list):  # isinstance() is the idiomatic type check
        print("This is a list.")
    else:
        print("This is not a list.")

l = [1, 2, 3, 'Hello World!', True, 0.8]

foo(5, 42)
list_checker(l)
list_checker(None)

The sum of the arguments 5 and 42 is 47.
This is a list.
This is not a list.
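The functions above print their results. A function that returns a value instead lets the caller store and reuse it. The following is a minimal sketch with hypothetical names (`add`, `is_list`), including a default argument.

```python
def add(x, y=0):
    """Return the sum of x and y (y defaults to 0)."""
    return x + y


def is_list(obj):
    """Return True if obj is a list (or an instance of a list subclass)."""
    return isinstance(obj, list)


total = add(5, 42)            # the result can be stored...
doubled = add(total, total)   # ...and reused in later calls

print(total, doubled, add(7))
print(is_list([1, 2, 3]), is_list(None))
```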


Another important part of Python is the use of classes. A class is a blueprint for creating objects, and the objects created from it carry attributes (values) and methods (functions that the object can perform).

In [8]:
class Bob:
    def __init__(self):
        self.name = "Bob Joe"
        self.profession = "Student"
        self.hobby = "Python Enthusiast"
    def hello_world(self):
        print("A hello from Bob! 🤗")

person = Bob()
person.hello_world()

print(person.name + " is a " + person.hobby + "!")

A hello from Bob! 🤗
Bob Joe is a Python Enthusiast!
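The `Bob` class hard-codes its attributes. A more reusable variant takes them as parameters of `__init__`, so one class can describe many people. The `Person` class below is a hypothetical sketch of that idea.

```python
class Person:
    """A hypothetical, parameterized version of the Bob class."""

    def __init__(self, name, hobby):
        self.name = name      # stored on the instance being created
        self.hobby = hobby

    def describe(self):
        return f"{self.name} is a {self.hobby}!"


bob = Person("Bob Joe", "Python Enthusiast")
alice = Person("Alice", "Data Wrangler")

print(bob.describe())
print(alice.describe())
```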


If you want to do the same process many times over, you are talking about a `loop`. Python has two primitive loop commands:

- `while` loops.
- `for` loops.

With the `while` loop, we can execute a set of statements for as long as a condition is `True`. A `for` loop iterates over the items of a sequence, such as a `range` or a list.


In [9]:
n = 0
while True:
    print(n, end=' ')  # end = ' ' to print outputs side-by-side
    n += 1
    if n == 21:
        break

print('\n')  # '\n' is the newline character; print() adds one more, leaving a blank line.

for i in range(21):  # range(21) yields 0, 1, ..., 20
    print(i, end=' ')

print('\n')

l = [1, 2, 3, 'Hello World!', True, 0.8]  # list
for x in l:
    print(x, end=' ')


0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 

1 2 3 Hello World! True 0.8 
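A few loop idioms come up often enough to be worth a sketch: `enumerate()` for looping with a position, `range()` with a step, and a `while` loop that stops through its own condition rather than through `break`.

```python
items = [1, 2, 3, 'Hello World!', True, 0.8]

# enumerate() yields (position, element) pairs
for position, item in enumerate(items):
    print(position, item)

# range(start, stop, step): the stop value is excluded
evens = list(range(0, 21, 2))
print(evens)

# a while loop whose condition does the stopping
n = 0
while n < 21:
    n += 1
print(n)
```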

Another clever way to implement loops is recursively. A `function` can call other functions, but it is also possible for a function to call itself. These constructs are called _recursive functions_. The canonical example of a recursive function is the calculation of a factorial.

In [10]:
def factorial(x):
    """
    Compute the factorial of a positive integer x.

    Parameters:
    x (int): The number whose factorial is to be computed.

    Returns:
    int: The factorial of x.
    """
    if x == 1:
        return 1
    else:
        return (x * factorial(x-1))

factorial(10)

3628800
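For comparison, the same computation can be written with a plain loop, which avoids Python's limit on recursion depth for large inputs. The function name `factorial_iterative` is made up for this sketch, and the result is checked against the standard library's `math.factorial`.

```python
import math


def factorial_iterative(x):
    """Compute x! for a non-negative integer x with a loop."""
    result = 1
    for k in range(2, x + 1):
        result *= k
    return result


print(factorial_iterative(10))
print(factorial_iterative(10) == math.factorial(10))
print(factorial_iterative(0))   # the empty product is 1
```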

Other useful statements are `try`, `except`, `else`, and `finally`, which are used for handling errors (exceptions):

- The `try` block lets you test a block of code for errors.
- The `except` block lets you handle the error.
- The `else` block lets you execute code when there is no error.
- The `finally` block lets you execute code, regardless of the result of the `try` and `except` blocks.


In [11]:
def get_age(age):
    """
    Check whether the given age can be converted to an int.

    Parameters:
    age: the input argument to be tested.

    Returns:
    None: This function does not return a value, but prints if
    age is valid (an int) or not.
    """
    try:
        x = int(age)
        print(f'You have {x} years of age.')
    except (ValueError, TypeError):  # int() raises these for invalid input
        print(f'"{age}" is not a valid age.')


get_age('Bob')
get_age(33)


"Bob" is not a valid age.
You have 33 years of age.


Here's another example:

In [12]:

def divide(num1, num2):
    """
    Compute the division of two numbers.

    Parameters:
    num1 (int or float): the numerator
    num2 (int or float): the denominator

    Returns:
    None: This function does not return a value, but prints
    the result of the division, or an error message if a
    `ZeroDivisionError` occurs. After the try and except
    blocks, it always prints "Operation complete!"
    """
    try:
        result = num1 / num2
        print(f"Result: {result}")
    except ZeroDivisionError:
        print("Error: Cannot divide by zero!")
    finally:
        print("Operation complete!")

divide(10, 5)
divide(1, 0)

Result: 2.0
Operation complete!
Error: Cannot divide by zero!
Operation complete!


In this example, the `divide()` function takes two arguments, `num1` and `num2`, and attempts to divide `num1` by `num2`. If `num2` is zero, a `ZeroDivisionError` is raised and caught in the `except` block, which prints an error message. The `finally` block runs in both cases, so "Operation complete!" is printed after every call.
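The `else` block from the list above has not appeared in an example yet. The sketch below, with the hypothetical name `safe_divide`, uses all four blocks and returns the result instead of only printing it.

```python
def safe_divide(num1, num2):
    """Return num1 / num2, or None when num2 is zero."""
    try:
        result = num1 / num2
    except ZeroDivisionError:
        print("Error: Cannot divide by zero!")
        return None
    else:
        # runs only when the try block raised nothing
        print(f"Result: {result}")
        return result
    finally:
        # runs on both paths, even though each of them returns
        print("Operation complete!")


safe_divide(10, 5)
safe_divide(1, 0)
```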

Instead of printing a message, we could also signal the problem to the caller by using `raise`.

In [14]:
def divide(num1, num2):
    """
    Compute the division of two numbers.

    Parameters:
    num1 (int or float): the numerator
    num2 (int or float): the denominator

    Returns:
    None: This function does not return a value, but prints
    the result of the division, or raises a `ZeroDivisionError`
    with a custom message.
    """
    try:
        result = num1 / num2
        print(f"Result: {result}")
    except ZeroDivisionError:
        raise ZeroDivisionError("Error: Cannot divide by zero!")

divide(1, 0)

ZeroDivisionError: Error: Cannot divide by zero!
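`raise` is also handy for rejecting values that Python itself would accept. As a sketch under that assumption, the hypothetical `parse_age` function below raises a `ValueError` for negative ages, and the caller decides how to handle it.

```python
def parse_age(text):
    """Convert text to an int age, raising ValueError on bad input."""
    age = int(text)   # int() itself raises ValueError for input like 'Bob'
    if age < 0:
        raise ValueError(f"Age cannot be negative: {age}")
    return age


for candidate in ['33', '-5', 'Bob']:
    try:
        print(parse_age(candidate))
    except ValueError as error:
        print(f"Rejected {candidate!r}: {error}")
```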

You can learn more about Python operators [here](https://www.w3schools.com/python/python_operators.asp) and [here](https://www.tutorialspoint.com/python/python_basic_operators.htm).


Let us now install some cool modules and libraries. `pip`, the package installer for Python, is the de facto and recommended tool for installing and managing Python packages. To use it, we just need to remember this simple command: `pip`!

If you are '_piping_' in a terminal (e.g., CMD for Windows users), you can try these commands:

```bash

pip install pandas # install the pandas library
pip install pandas --upgrade # upgrades the pandas library to its latest version
pip uninstall pandas # uninstalls the pandas library

```

On a Google Colab notebook, we use `!` in front of `pip`.

```bash

!pip install pandas
!pip install pandas --upgrade
!pip uninstall pandas

```

Here is a small cheat sheet for working with `pip`.

```bash

Commands:
  install                     Install packages.
  download                    Download packages.
  uninstall                   Uninstall packages.
  freeze                      Output installed packages in requirements format.
  list                        List installed packages.
  show                        Show information about installed packages.
  check                       Verify installed packages have compatible dependencies.
  search                      Search PyPI for packages.
  wheel                       Build wheels from your requirements.
  hash                        Compute hashes of package archives.
  completion                  A helper command used for command completion.
  help                        Show help for commands.

```
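Besides the `pip` commands above, the installed version of a package can also be checked from Python itself. The sketch below relies on the standard-library `importlib.metadata` module (Python 3.8+). The helper name `installed_version` and the fake package name are made up for illustration.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string of a package, or None if absent."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


# the last name is a deliberately fake package, so it should report None
for name in ['pandas', 'numpy', 'surely-not-installed-pkg-xyz']:
    print(name, installed_version(name))
```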

After installing a library, we can import it under an alias (a '_pseudonym_'):

```python

import a_ridiculously_long_library_name as short_name

```

For example:

```python

import pandas as pd

```
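The different import forms can be tried out with standard-library modules alone, so nothing needs to be installed. This is only an illustration, and the alias `stats` is an arbitrary choice.

```python
import math                      # access names through the module: math.sqrt
import statistics as stats       # same idea, under a shorter alias
from math import pi, sqrt        # pull specific names into the current namespace

print(math.sqrt(16))
print(stats.mean([1, 2, 3, 4]))
print(sqrt(16), pi)
```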

### Pandas

[Pandas](https://pandas.pydata.org/) is a popular open-source library for data manipulation and analysis in `Python`. It provides powerful data structures for working with structured data, such as the `Series` (a one-dimensional labeled array) and the `DataFrame` (a two-dimensional table).

The library is widely used in data science, finance, and other fields where large amounts of data need to be processed and analyzed. `Pandas` allows users to easily load, manipulate, and analyze data from various sources, such as CSV files, Excel spreadsheets, and SQL databases.

Its core data structure, the `DataFrame`, provides a flexible and efficient way to work with data, allowing users to perform operations such as filtering, grouping, and aggregation. `Pandas` also offers a wide range of data analysis functions and statistical tools, making it an essential tool for data scientists and analysts.

![cool_pandas](https://c.tenor.com/_TV6qVC4toAAAAAM/panda-dancing.gif)

[Source](https://tenor.com/pt-BR/view/panda-dancing-moves-funny-shaking-gif-17764808).

> Quick Tip: [Rob Mulla](https://www.youtube.com/@robmulla) has a great series of tutorials for data scientists wanting to work with `Pandas` and other `Python` libraries for data science.

One of the first things to do in Pandas is to load up some data. Hence, let us load a `CSV` file from the Hugging Face hub. For that, we are going to use the `datasets` library.

To get access to our data, let us first `pip install` the `datasets` library.

> **Note**: all datasets and models related to the course and repo are in the Hub.

In [15]:
# the `-q` (quiet) flag makes the installation less verbose
%pip install datasets -q

Note: you may need to restart the kernel to use updated packages.


Now, we can use the `load_dataset` functionality and download our data straight from the Hub 🤗. This function downloads a `DatasetDict`, from which we can extract our data. You can also transform this dataset object into a `pandas.DataFrame`, giving you all the functionalities it offers.

In [1]:
from datasets import load_dataset

# load the dataset from the hub
dataset = load_dataset("dieineb/example_data_frame")

# turn the dataset into a pandas.DataFrame
df = dataset['train'].to_pandas()

display(df.head()) # or simply use df.head()

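|   | First Name | Last Name | Gender | Country | Age | Date | Id |
|---|------------|-----------|--------|---------|-----|------|----|
| 0 | Dulce | Abril | Female | United States | 32 | 15/10/2017 | 1562 |
| 1 | Mara | Hashimoto | Female | Great Britain | 25 | 16/08/2016 | 1582 |
| 2 | Philip | Gent | Male | France | 36 | 21/05/2015 | 2587 |
| 3 | Kathleen | Hanner | Female | United States | 25 | 15/10/2017 | 3549 |
| 4 | Nereida | Magwood | Female | United States | 58 | 16/08/2016 | 2468 |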


Pandas `DataFrames` can be understood as tables with `columns` and an `index`:

- `df.columns` gives you the name of all columns in a `DataFrame`.
- `df.index` gives you the row labels of a `DataFrame`.

Both are `pandas.Index` objects. We can easily turn them into a list by using `list()`. And if you want a NumPy array, you can use `.values`.

In [17]:
print(list(df.columns))
print(list(df.index))
print(df.columns.values)
print(df.index.values)


['First Name', 'Last Name', 'Gender', 'Country', 'Age', 'Date', 'Id']
[0, 1, 2, 3, 4, 5, 6, 7, 8]
['First Name' 'Last Name' 'Gender' 'Country' 'Age' 'Date' 'Id']
[0 1 2 3 4 5 6 7 8]


Thus, if we want all names in the column `First Name`, we can select this column as a `pandas.Series` and turn it into a list or an array.

In [18]:
display(df['First Name'])
print(list(df['First Name']))

0       Dulce
1        Mara
2      Philip
3    Kathleen
4     Nereida
5      Gaston
6        Etta
7     Earlean
8    Vincenza
Name: First Name, dtype: object

['Dulce', 'Mara', 'Philip', 'Kathleen', 'Nereida', 'Gaston', 'Etta', 'Earlean', 'Vincenza']


It is good practice to keep column names in `snake case` (lowercase, with each space replaced by an underscore). A name containing a space cannot be used with attribute access: `df.First Name.values` is a syntax error, although `df['First Name'].values` works. With snake case, `df.first_name.values` works just fine. So, let us rename the columns to conform to the `snake case` style.

In [19]:
df.columns = ['first_name', 'last_name', 'gender', 'country', 'age', 'date', 'id']

df.first_name.values

array(['Dulce', 'Mara', 'Philip', 'Kathleen', 'Nereida', 'Gaston', 'Etta',
       'Earlean', 'Vincenza'], dtype=object)
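Typing the new names by hand works for seven columns, but the conversion can also be automated. The helper `to_snake_case` below is a hypothetical sketch that operates on plain strings. With a DataFrame at hand, its result would be assigned to `df.columns`.

```python
def to_snake_case(name):
    """Lowercase a column name and replace runs of whitespace with underscores."""
    return "_".join(name.strip().lower().split())


original = ['First Name', 'Last Name', 'Gender', 'Country', 'Age', 'Date', 'Id']
renamed = [to_snake_case(column) for column in original]
print(renamed)

# With a DataFrame, the same idea would read:
# df.columns = [to_snake_case(column) for column in df.columns]
```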

Let us now make a `for` loop that iterates over our list of names (first and last) and puts them together.

In [20]:

for i in range(len(df)): # loops for the length of the DataFrame
    name = df.first_name[i] # takes the i-th first name
    last = df.last_name[i] # takes the i-th last name
    print(f'{name} {last}.\n') # puts both together


Dulce Abril.

Mara Hashimoto.

Philip Gent.

Kathleen Hanner.

Nereida Magwood.

Gaston Brumm.

Etta Hurn.

Earlean Melgar.

Vincenza Weiland.



We could get a list of all these values using a "one-liner" technique called list comprehension, which is a fancy way to iterate over lists.

In [21]:
[f'{df.first_name[i]} {df.last_name[i]}' for i in range(len(df))]

['Dulce Abril',
 'Mara Hashimoto',
 'Philip Gent',
 'Kathleen Hanner',
 'Nereida Magwood',
 'Gaston Brumm',
 'Etta Hurn',
 'Earlean Melgar',
 'Vincenza Weiland']
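Indexing by position is one way to combine the columns. Two alternatives avoid the explicit index: adding the columns as strings, or pairing them with `zip()`. The sketch below uses a tiny stand-in DataFrame (`people`) built from the first three rows shown earlier, not the full dataset.

```python
import pandas as pd

# a tiny stand-in for the tutorial's DataFrame (first three rows only)
people = pd.DataFrame({
    'first_name': ['Dulce', 'Mara', 'Philip'],
    'last_name': ['Abril', 'Hashimoto', 'Gent'],
})

# column-wise string concatenation, with no explicit loop
full_names = (people['first_name'] + ' ' + people['last_name']).tolist()
print(full_names)

# zip() pairs the two columns row by row
zipped = [f'{first} {last}'
          for first, last in zip(people['first_name'], people['last_name'])]
print(zipped)
```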

`Pandas` also comes with pre-built functions for statistical analysis:

- `.mean()`: the mean of the distribution.
- `.max()`: the maximum value in the distribution.
- `.min()`: the minimum value in the distribution.
- `.std()`: the standard deviation of the distribution.
- `.var()`: the variance of the distribution.

To get our results nice and readable, we can use `round()` to round a value to a given number of decimal places:

```python
x = 3.7777
round(x, 3)  # 3.778
round(x, 2)  # 3.78
round(x, 1)  # 3.8
```

And we can achieve the same result with string formatting:

```python
print(f"The income tax is ${3.777:.3f}")
print(f"The income tax is ${3.777:.2f}")
print(f"The income tax is ${3.777:.1f}")
```
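The part after the colon inside the braces is a format specification, and it can do more than fix the number of decimals. A few common variants, shown on made-up values:

```python
value = 1234.5678
ratio = 0.256

print(f"{value:.2f}")      # fixed-point, two decimals
print(f"{value:,.1f}")     # thousands separator
print(f"{ratio:.1%}")      # percentage
print(f"{42:>6}|")         # right-aligned in a field of 6 characters
print(f"{value:.3e}")      # scientific notation
```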


In [22]:
mean_age = df.age.mean()
max_age = df.age.max()
min_age = df.age.min()
std_age = df.age.std()
var_age = df.age.var()

print(f'The mean age of users in our dataset is: {mean_age:.2f}. \n')
print(f'The maximum age in our dataset is: {max_age:.2f}. \n')
print(f'The minimum age in our dataset is: {min_age:.2f}. \n')
print(f'The standard deviation in age in our dataset is: {std_age:.2f}. \n')
print(f'The variance in age in our dataset is: {var_age:.2f}. \n')


The mean age of users in our dataset is: 35.89. 

The maximum age in our dataset is: 58.00. 

The minimum age in our dataset is: 24.00. 

The standard deviation in age in our dataset is: 13.15. 

The variance in age in our dataset is: 172.86. 
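One detail worth knowing: by default, `.std()` and `.var()` in Pandas compute the sample statistics (dividing by `n - 1`), which can be changed with the `ddof` parameter. The sketch below illustrates this on the five ages visible in `df.head()` only, so its numbers differ from the full-dataset results above.

```python
import statistics

import pandas as pd

# ages from the five rows shown by df.head(), not the full dataset
ages = pd.Series([32, 25, 36, 25, 58])

sample_std = ages.std()              # pandas default: ddof=1 (sample)
population_std = ages.std(ddof=0)    # divide by n instead of n - 1

print(f"{ages.mean():.2f}")
print(f"{sample_std:.2f} vs {population_std:.2f}")

# the standard library offers both conventions as separate functions
print(abs(sample_std - statistics.stdev(ages.tolist())) < 1e-9)
print(abs(population_std - statistics.pstdev(ages.tolist())) < 1e-9)

# describe() bundles count, mean, std, min, quartiles, and max
print(ages.describe())
```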



### Plotly

[`Plotly`](https://plotly.com/python/) is a data visualization library that allows users to create interactive, high-quality graphs and charts. It provides a range of visualization tools, including scatter plots, line charts, bar charts, heatmaps, and more.

The library offers a simple syntax for creating interactive visualizations that can be easily customized and shared. With `Plotly`, users can add annotations, hover labels, and zoom and pan features to their visualizations, making exploring and understanding complex data sets easy.

![cool_graph](https://i.stack.imgur.com/7MGHV.gif)

[Source](https://stackoverflow.com/questions/71060711/how-to-animate-line-in-scatter-plot-using-plotly-express).

The quickest way to create plots with `Plotly` is by using `plotly.express`. Let us create a `histogram` to show how often each `country` appears in our `DataFrame`.

In [23]:
import plotly.express as px

# df is our DataFrame
# x="country" means "count the values in the country column"

fig = px.histogram(df, x="country") # for a categorical column, the histogram counts how often each value occurs

fig.update_layout(template='plotly_dark',  # make the plot look fancy!
                  title='User distribution by Country',
                  paper_bgcolor='rgba(0, 0, 0, 0)', # make the area around the plot transparent
                  plot_bgcolor='rgba(0, 0, 0, 0)' # make the plotting area transparent
                  )
fig.show()  # show the plot


Our `DataFrame` is really small, and we don't have much to plot. Thus, let us create some fake data points using `NumPy`.

### NumPy

[`NumPy`](https://numpy.org/) (short for "_Numerical Python_") is a popular `Python` library for numerical computing. It provides powerful tools for working with arrays and matrices and a range of mathematical functions for working with numerical data.

The library also provides tools for linear algebra, Fourier transforms, random number generation, and more. `NumPy` is widely used in scientific computing, data analysis, and machine learning and is a key component of many popular data science libraries in `Python`.

![numpy](https://miro.medium.com/max/1400/1*Nhz7M4r_x8MuJtZ1orr0jg.gif)

Here are some useful `NumPy` functions to get you started:

- `np.random` allows us to sample randomly from a variety of different distributions:

    - [`rand`](https://numpy.org/doc/1.16/reference/generated/numpy.random.rand.html#numpy.random.rand "numpy.random.rand"): Random values in a given shape.
    - [`randn`](https://numpy.org/doc/1.16/reference/generated/numpy.random.randn.html#numpy.random.randn "numpy.random.randn"): Return a sample (or samples) from the "_standard normal_" distribution.
    - [`randint`](https://numpy.org/doc/1.16/reference/generated/numpy.random.randint.html#numpy.random.randint "numpy.random.randint"): Return random integers from _low_ (inclusive) to _high_ (exclusive).
    - [`random`](https://numpy.org/doc/1.16/reference/generated/numpy.random.random.html#numpy.random.random "numpy.random.random"): Return random floats in the half-open interval [0.0, 1.0).
    - [`choice`](https://numpy.org/doc/1.16/reference/generated/numpy.random.choice.html#numpy.random.choice "numpy.random.choice"): Generates a random sample from a given 1-D array.

> Note: You can learn more about `NumPy` functions [here](https://www.w3schools.com/python/numpy/default.asp) and [here](https://numpy.org/doc/stable/user/quickstart.html).
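To get a feel for the functions listed above, the sketch below calls each of them once and prints the shape or range of what comes back. The seed value `42` is arbitrary. Fixing it only makes the draws repeatable from run to run.

```python
import numpy as np

np.random.seed(42)  # fixing the seed makes the draws repeatable

uniform = np.random.rand(2, 3)              # shape (2, 3), values in [0, 1)
normal = np.random.randn(5)                 # 5 draws from the standard normal
dice = np.random.randint(1, 7, size=10)     # integers 1..6 (7 is excluded)
floats = np.random.random(4)                # 4 floats in [0, 1)
pick = np.random.choice(['a', 'b', 'c'])    # one element of a 1-D sequence

print(uniform.shape, normal.shape, floats.shape)
print(dice.min(), dice.max())
print(pick)
```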

Let's first try the `np.random.randn()` function.

In [28]:
import numpy as np

# x will be equal to an array ("a collection of similar data elements")

x = np.random.randn(10)

print(f'The object {x} is of the type {type(x)}.')

# the elements of this array are randomly sampled from the normal distribution

fig = px.histogram(x=x)

fig.update_layout(template='plotly_dark',
                  title='"Kind" of a Normal distribution',
                  paper_bgcolor='rgba(0, 0, 0, 0)',
                  plot_bgcolor='rgba(0, 0, 0, 0)'
                  )
fig.show()

The object [-0.07484795 -0.07432695  1.18287033 -0.26272749 -0.21991526 -1.80951106
 -0.34873745 -0.38617293  0.18863733  0.97912227] is of the type <class 'numpy.ndarray'>.


If you make $x$ larger (that is, draw more samples), you are going to see a "more nicely drawn" normal distribution.


In [29]:
x = np.random.randn(100000)

fig = px.histogram(x=x)

fig.update_layout(template='plotly_dark',
                  title='A nice Normal distribution',
                  paper_bgcolor='rgba(0, 0, 0, 0)',
                  plot_bgcolor='rgba(0, 0, 0, 0)'
                  )
fig.show()


You can also use `np.random` to `choose` a random element from a list with `np.random.choice()`.


In [31]:
fruits = ['🍎', '🍊', '🍌', '🥭', '🍇', '🍓']

print(f'Today I will eat: {np.random.choice(fruits)}!')


Today I will eat: 🍎!


`NumPy` is all about working with n-dimensional arrays. Here are some list and `array` indexing tricks:

```python

my_list[0] # selects the first element of a list/array
my_list[-1] # selects the last element of a list/array
my_list[0:2] # selects the first two elements (index 2 is excluded)
my_list[::-1] # reverses a list/array
np.array(my_list) # turns a list into a 1-D NumPy array

```


In [32]:
print(fruits[0])
print(fruits[-1])
print(fruits[0:5])
print(fruits[:: -1])
fruits = np.array(fruits)
print(type(fruits))


🍎
🍓
['🍎', '🍊', '🍌', '🥭', '🍇']
['🍓', '🍇', '🥭', '🍌', '🍊', '🍎']
<class 'numpy.ndarray'>
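Turning a list into an array changes how arithmetic behaves, and arrays can have more than one dimension. A short sketch with illustrative names (`values`, `matrix`):

```python
import numpy as np

values = [1, 2, 3, 4]
array = np.array(values)

print(values * 2)         # list: repetition
print(array * 2)          # array: element-wise multiplication
print(array[array > 2])   # a boolean mask keeps the elements greater than 2

matrix = np.array([[1, 2, 3],
                   [4, 5, 6]])
print(matrix.shape)       # (rows, columns)
print(matrix[1, 2])       # row 1, column 2
print(matrix[:, 0])       # first column
```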


Finally, let us learn how to build a line. `np.linspace()` creates an array of evenly spaced samples, calculated over the closed interval `[start, stop]`. We can use this to create functions we can graph using `plotly`!


In [33]:
x = np.linspace(-3, 3, num=100) # num = how_many_points_do_you_want

fig = px.line(x=x, y=x)
fig.update_layout(template='plotly_dark',
                  title='A linear function',
                  paper_bgcolor='rgba(0, 0, 0, 0)',
                  plot_bgcolor='rgba(0, 0, 0, 0)')
fig.show()

fig = px.line(x=x, y=x**2)
fig.update_layout(template='plotly_dark',
                  title='A quadratic function',
                  paper_bgcolor='rgba(0, 0, 0, 0)',
                  plot_bgcolor='rgba(0, 0, 0, 0)')
fig.show()

fig = px.line(x=x, y=x**3)
fig.update_layout(template='plotly_dark',
                  title='A cubic function',
                  paper_bgcolor='rgba(0, 0, 0, 0)',
                  plot_bgcolor='rgba(0, 0, 0, 0)')
fig.show()
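The arrays behind these plots can also be inspected without plotting. The sketch below uses a smaller `num` so the values are easy to read, and contrasts `np.linspace` with `np.arange`, which takes a step instead of a count.

```python
import numpy as np

x = np.linspace(-3, 3, num=7)   # both endpoints are included
print(x)

step = x[1] - x[0]              # spacing is (stop - start) / (num - 1)
print(step)

y = x**2                        # arithmetic applies to every element
print(y)

# np.arange takes a step instead of a count and excludes the stop value
print(np.arange(-3, 3, 1))
```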


That is all, folks! For more advanced stuff, go to our [Basic Pandas/Scikit-learn/NumPy Tutorial](https://github.com/Nkluge-correa/teeny-tiny_castle/blob/fa17764aa8800c388d0d298b750c686757e0861e/ML%20Intro%20Course/3_Basic_Pandas_Scikit_learn_NumPy_Tutorial.ipynb). You can also learn much of what there is to know about Python in the tutorial section of [python.org](https://docs.python.org/3/tutorial/index.html).

---

Return to the [castle](https://github.com/Nkluge-correa/teeny-tiny_castle).
