# How do I use basic Python skills to help inform a travel agency?

## Goals

By the end of this case, you will be able to understand the basics of Python from a programming/software engineering perspective. This is crucial for you to begin to perform actual data work using Python.

You should further have a working knowledge of various fundamental building blocks of Python (listed at the end of the case). This means being able to write simple code to perform analytical tasks such as creating metrics for making a business decision.

## Introduction

<img src="data/images/san_agustin.jpg" alt="drawing" width="1000" height="250"/>

**Business Context.** You are a data science consultant for a new travel agency called *World Travelers*. The agency has hundreds of thousands of users. Because they are a relatively new company, they have not set up their data pipeline yet, so they're collecting the data about the travelers manually. 

Since the dataset collected can get quite large and increases by the day, they need urgent advice to organize their data and get it into a better format so they can choose the right subscription plan for the analytics platform that they are looking to purchase.

**Business Problem.** Your task is to **investigate the raw data the client has provided and help them answer some questions about their travelers based on it**.

**Analytical Context.** The data provided by the client is in the CSV (comma-separated values) file *travelers_information.csv*. This file contains some trip data, including the countries the travelers have visited through the agency, the number of travelers who went to those countries last year (2019), and the travelers' age groups.

The CSV file has the following columns: 

- **Countries**: Country names
- **Number of Travelers**: Number of travelers for that country
- **Age (0,18]**: Travelers from age 0 to age 18
- **Age (18,35]**: Travelers from age 19 to age 35
- **Age (35,50]**: Travelers from age 36 to age 50
- **Age (50,100]**: Travelers from age 51 to age 100
- **Pets**: Number of pets that traveled to that country

<img src="data/images/agency_data.png" alt="drawing" width="600" height="180"/>

Of course, in order to do all of this, you will need to know the basics of Python programming. Let's jump into that first.

## What is programming?

**Programming** is just the act of writing clear, unambiguous instructions that a computer can understand and execute to achieve a desired end result. You can think of it as a cooking recipe: instructions are given in sequence, and it is assumed that if you follow them faithfully, you will achieve the intended result. This is why we first need to know all the ingredients, what to do with them, and how we are supposed to use them before writing our programs.

More specifically, programming requires three things:

1. A well-defined plan of attack with clear steps
2. Brief comments about the steps, so that later when you re-read your code, it will be easy to know what you were intending to do
3. Clear, working code for each step

### Variables in Python

In Python, a **variable** is a place where we store certain information. Depending on the type of information that we choose to store (e.g. text, numbers, true/false (also known as *Booleans*), etc.), the variable will be of one type or another. 

Each variable has to have a name with which to refer to it. Python takes into account if we write the variable in upper or lower case (which is known as being *case-sensitive*). Thus, a variable called `a_1` is not the same as one called `A_1`.

We have to remember that the name of a variable can't match any of Python's reserved **keywords** (`if`, `for`, `while`, `def`, etc.). You will learn more about these keywords later. Variable names also cannot contain spaces or special characters such as `-`, `$`, or `@`, and they cannot start with a digit. Accented letters are technically allowed in Python 3, but it is safer to avoid them.

Defining variables in Python is done as follows:

* **Assigning a value**: `variable = value`

* **Assigning multiple values**: `variable_1, variable_2 = value1, value2`

You can then print the value of a variable to the screen using the function `print()`. Here is an example:

In [None]:
#This is an example
variable_1 = 10
print(variable_1)
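
As a small sketch of the two assignment forms and of case sensitivity, the cell below uses names and values that are made up for illustration and are not taken from the agency's data:

```python
# Hypothetical names and values, chosen only for illustration
a_1 = 5                      # assign a single value
A_1 = 50                     # a different variable: names are case-sensitive
first, second = "x", "y"     # assign multiple values on one line

print(a_1, A_1)        # 5 50
print(first, second)   # x y
```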

### Exercise 1:

Create the variables for the number of tourists in each country last year. Remember that the name of a variable can't have spaces in the middle - use the underscore key to bridge natural gaps where spaces would normally go.

**Answer.**

---------

### Exercise 2:
How many people went to Greece? 

**Answer.**

---------

Now, it might make sense to figure out the total number of travelers that went to a select set of countries. This is the purpose of **arithmetic operators**. You've probably heard of most of these before in your grade school math classes. They include:

* `+`: Addition 

* `-`: Subtraction

* `*`: Multiplication 

* `/`: Division (with decimals)

* `%`: Remainder after division, also known as "modulus"

* `**`: Power/Exponentiation

* `//`: Floor division (the result is rounded down to a whole number)
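
To see how each operator behaves, here is a minimal sketch. The variable names and numbers are hypothetical and do not come from the agency's file:

```python
# Hypothetical numbers, not taken from the agency's data
total_seats = 17
group_size = 5

print(total_seats + group_size)   # 22
print(total_seats - group_size)   # 12
print(total_seats * group_size)   # 85
print(total_seats / group_size)   # 3.4
print(total_seats % group_size)   # 2
print(total_seats ** 2)           # 289
print(total_seats // group_size)  # 3
```

Note that `/` gives a result with decimals, while `//` and `%` together give the whole-number quotient and the remainder.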

### Exercise 3:

How many travelers went to Congo, Australia, and Indonesia? What is the total number of travelers to these three countries?

**Answer.**

---------

### Exercise 4:

How many people went to the continents of Oceania and South America? 

**Bonus**: How many travelers went to the rest of the continents?

**Answer.**

---------

### Numerical variables

There are two main numeric variable types: integers (`int`) and floating-point numbers (`float`), which represent real numbers:

In [None]:
# Let's add two new countries and the number of travelers who went to those specific countries:
peru = int(879)
croatia = float(125.5)

print("Peru:", peru)
print("Croatia:", croatia)

Now why would Croatia have 125.5 travelers? Is that possible? Well, according to the agency's data, they count a pet as 0.5 travelers. We can decide for ourselves if that's a sensible way to count pets, but at least it's consistent with the agency's documentation.

If we want to know the data type of a variable, we can use the built-in `type()` function:

In [None]:
type(peru)

In [None]:
type(croatia)
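
The two numeric types can be mixed and converted. The following is a minimal sketch with made-up values (`whole` and `fractional` are hypothetical names):

```python
# Hypothetical values for illustration
whole = 3
fractional = 2.9

print(type(whole + fractional))  # <class 'float'>: mixing int and float gives a float
print(float(whole))              # 3.0
print(int(fractional))           # 2 (the decimal part is dropped, not rounded)
```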

### Exercise 5:

How many people went to Peru and Croatia? 

**Answer.**

---------

**Bonus:** If you ever need to use complex numbers in Python, good news, Python recognizes them!

In [None]:
# Let's say the number of travelers who have ever been to Saturn is a complex number, because we
# don't have any real proof yet.

saturn = 22j

print(saturn)
type(saturn)

### Text variables

Variables that store text are called strings (`str`). They must be enclosed in single quotes (`'`) or double quotes (`"`); if the text spans several lines, it is enclosed in triple quotes (`"""` or `'''`):

In [None]:
destiny = "Countries"
clients = 'Travelers'

print(destiny)
print(clients)

Let's prepare a welcome to the new travelers:

In [None]:
print("Welcome", clients, "we hope you have lots of fun in all the", destiny, "you want to visit!")

In [None]:
type(clients)

In [None]:
#We can create a new variable that includes a multiline text.

welcome_back = """Welcome
back
dear
travelers"""
print(welcome_back)

### Exercise 6:

The city of Chicago has given the company "World Travelers" a specific ID. Create this variable `agency_id`, and set it equal to the string "112235". You can make it a string since you won't need to perform any arithmetic operations with it.

**Answer.**

---------

In [None]:
agency = "World Travelers"
print(agency + agency_id)

We get a **concatenation**! This is how we join two strings: we just use the `+` operator.

In [None]:
print("Welcome to the best travel agency! This is " + agency)
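
The `+` operator only joins strings with other strings. As a sketch of what to do when one of the pieces is a number (the names `city` and `office_count` are hypothetical), you can convert it with `str()` or use an f-string:

```python
# Hypothetical values for illustration
city = "Chicago"
office_count = 3   # an integer, not a string

# "+" only joins strings, so convert the number first
message = "Offices in " + city + ": " + str(office_count)
print(message)    # Offices in Chicago: 3

# An f-string inserts values directly into the text
message_f = f"Offices in {city}: {office_count}"
print(message_f)  # Offices in Chicago: 3
```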

### Boolean variables

The **Boolean** type is a binary data type; that is, it can take only one of two possible values. In Python these are the constants `True` and `False`.

The two values lend themselves to different interpretations: Yes / No, On / Off, 1 / 0, among others. In arithmetic, Python treats `True` as 1 and `False` as 0.

In [None]:
# We can create a boolean variable with the value we want, True or False

boolean_variable_1 = True
boolean_variable_2 = False

print(boolean_variable_1)
print(type(boolean_variable_1))
print(boolean_variable_2)
print(type(boolean_variable_2))
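
Booleans also appear as the result of comparisons, and they behave like the numbers 1 and 0. A short sketch with hypothetical flag names:

```python
# Hypothetical flags for illustration
has_passport = True
has_visa = False

print(has_passport == 1)        # True
print(has_visa == 0)            # True
print(has_passport + has_visa)  # 1
print(type(5 > 3))              # <class 'bool'>
```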

### Empty values

We can also define variables which represent the concept of nothing, by assigning them the value `None`:

In [None]:
null_variable = None

print(null_variable)
type(null_variable)

The agency is required to report some countries where they operate but had no travelers. These include the UK, Poland and Norway. So we assign these countries the value `None`:

In [None]:
uk = None
poland = None
norway = None

print(uk, poland, norway)
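
Keep in mind that `None` is not the same as zero. The sketch below uses a hypothetical variable and the `is` operator, which is the usual way to test for `None`:

```python
# Hypothetical variable for illustration
travelers_to_mars = None

print(travelers_to_mars is None)   # True
print(travelers_to_mars == 0)      # False: None is not the same as zero
print(type(travelers_to_mars))     # <class 'NoneType'>
```

Because `None` is not a number, using it in arithmetic (for example `None + 1`) raises a `TypeError`.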

## Operators in Python

**Operators** are essentially actions that you can perform on Python variables or objects. The Python language supports the following types of operators:

1. Arithmetic Operators
2. Comparison (Relational) Operators
3. Logical Operators
4. Assignment Operators
5. Bitwise Operators
6. Membership Operators
7. Identity Operators

We'll discuss the first three of these here; the rest will be left to a future case.

### Arithmetic operators

We covered these in an earlier section - now let's do some work with them!

### Exercise 7:

The agency wants to get the **percentage** of travelers for each country that qualify as kids (i.e. age from 0 to 18 years). They want to start by getting the percentages for the top five countries with the most travelers. We already know these countries are Egypt, the United States, Spain, Indonesia, and South Africa. Go ahead and compute these figures.

**Hint:** Use the `round()` function, which allows you to round off your result to a specified number of decimal places, to allow you to display the numbers more easily.
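
As a sketch of how `round()` works, here is a percentage computed from made-up counts. The names and numbers are hypothetical and are not the agency's figures:

```python
# Hypothetical counts, not taken from the agency's data
kids = 7
all_travelers = 30

share = kids / all_travelers * 100   # percentage with many decimals
print(round(share, 2))  # 23.33 (two decimal places)
print(round(share))     # 23 (no second argument: nearest whole number)
```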

<img src="data/images/top_5_countries.png" alt="drawing" width="700" height="180"/>

**Answer.**

---------

### Comparison (Relational) operators

**Comparison operators** are used to ascertain the relation between two quantities or objects. Again, most of these will be familiar to you from grade school math:

| Symbol | Task Performed |
|----|---|
| == | True, if values are equal |
| is | True, if identical, i.e. the **same** object  |
| !=  | True, if not equal to |
| < | less than |
| > | greater than |
| <=  | less than or equal to |
| >=  | greater than or equal to |
| in  | True, if the value is a member of a collection (list, set, dictionary) |
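
A minimal sketch of the operators in the table, using hypothetical values:

```python
# Hypothetical values for illustration
a = 10
b = 20
visited = ["Peru", "Croatia"]

print(a == b)   # False
print(a != b)   # True
print(a < b)    # True
print(a >= b)   # False
print("Peru" in visited)    # True
print("Greece" in visited)  # False

# Two separate lists with equal contents are equal but not the same object
x = [1, 2]
y = [1, 2]
print(x == y)   # True
print(x is y)   # False
```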

### Exercise 8:

Let's compare the percentages that we got from the previous exercise:

1. Is the percentage of underage travelers in Egypt the same as in Spain? 
2. Is the percentage of underage travelers in the United States greater than the percentage of underage travelers in Indonesia? 
3. Is the percentage of underage travelers in South Africa less than the percentage of underage travelers in Spain? 
4. Find out if the percentage of underage travelers in South Africa is different from the percentage of underage travelers in Indonesia by using the `!=` operator.

**Answer.**

---------

### Logical operators

**Logical operators** are used to combine **conditional statements** (i.e. statements that evaluate to true or false when the code runs). These conditional statements often use the comparison operators described in the previous section.

* **and**: Returns True if both statements are true, or False otherwise
* **or**: Returns True if at least one of the statements is true, or False otherwise
* **not**: Reverses the result: returns False if the result is true, or True if the result is false
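
The three logical operators can be sketched as follows. The percentages are hypothetical and are not taken from the agency's data:

```python
# Hypothetical percentages for illustration
pct_a = 12.5
pct_b = 30.0

print(pct_a > 10 and pct_b > 40)   # False: only the first condition holds
print(pct_a > 10 or pct_b > 40)    # True: at least one condition holds
print(not pct_a > 10)              # False: reverses a true condition
```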

### Exercise 9:

1. Determine if the percentage of underage travelers in South Africa is greater than 4% **and** if the percentage of underage travelers in Indonesia is less than 10%.

2. Determine if the percentage of underage travelers in the United States is greater than 15% **or** if the percentage of underage travelers in Egypt is greater than 40%.

**Answer.**

---------

## Some common errors

1. If we try to use a variable that has not been previously defined or initialized, the interpreter will show us an error:
```python
print(abc)
# NameError: name 'abc' is not defined
```

2. If we try to perform mathematical operations between numeric and string variables, we will get an error as well:
```python
a, b, c = 1223, 342, "Hello"
print(a+b+c)
# TypeError: unsupported operand type(s) for +: 'int' and 'str'
```
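
If you want to see both errors without stopping the notebook, one option is to wrap each statement in `try`/`except`, a construct that goes slightly beyond this case. The following is only a sketch, and it assumes the name `abc` has not been defined anywhere earlier:

```python
# Sketch: catch each error and print its type and message
try:
    print(abc)          # abc was never defined
except NameError as err:
    print(type(err).__name__, "-", err)

a, b, c = 1223, 342, "Hello"
try:
    print(a + b + c)    # int + str is not allowed
except TypeError as err:
    print(type(err).__name__, "-", err)

# Converting the numeric sum to text first makes the operation valid
print(str(a + b) + c)   # 1565Hello
```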

What other types of errors do you think can happen often?

## Conclusions & Takeaways

Python combines easy-to-understand syntax with a large variety of available tools. Because Python code is highly readable, it can save time when writing, reviewing, and maintaining programs.

Python has uses far beyond what we've covered in this case, from game development to data visualization to networking. It is a valuable language to learn for your career and is widely reported to be among the fastest-growing major programming languages.