# Basics

Getting Started.  


**Outcomes**

- Programming concepts  
  
  - Understand variable assignment  
  - Know what a function is and how to figure out what it does  
  - Be able to use tab completion  
  
- Numbers in Python  
  
  - Understand how Python represents numbers  
  - Know the distinction between `int` and `float`  
  - Be familiar with various binary operators for numbers  
  - Introduction to the `math` library  
  
- Text (strings) in Python  
  
  - Understand what a string is and when it is useful  
  - Learn some of the methods associated with strings  
  - Combining strings and output  
  
- True and False (booleans) in Python  
  
  - Understand what a boolean is  
  - Become familiar with all binary operators that return booleans  

## Outline

- Getting Started

# Getting Started

**Prerequisites**

- Good attitude  
- Good work ethic  


**Outcomes**

- Understand what a programming language is.  
- Know why we chose Python  
- Know what the Jupyter Notebook is  
- Be able to start JupyterLab in the chosen environment (cloud or personal computer)  
- Be able to open a Jupyter notebook in JupyterLab  
- Know Jupyter Notebook basics: cell modes, editing/evaluating cells  

## Welcome

Welcome to the start of your path to learning how to work with data in the
Python programming language!

A programming language is, loosely speaking, a structured subset of natural
language (words) and special characters (e.g. `,` or `{`) that allow humans
to describe operations they would like their computer to perform on their behalf.

The programming language translates these words and
symbols into instructions the computer can execute.

### Why Python?

Among the hundreds of programming languages to choose from, we chose to teach you Python for the
following reasons:

- Easy to learn and use (relative to other programming languages).  
- Designed with readability in mind.  
- Excellent tools for handling data efficiently and succinctly.  
- Cemented as the world’s [third most popular](https://www.zdnet.com/article/programming-language-of-the-year-python-is-standout-in-latest-rankings/)
  programming language, the most popular scripting language, and an increasing standard for
  [data analysis in industry](https://medium.com/@data_driven/python-vs-r-for-data-science-and-the-winner-is-3ebb1a968197).  
- General purpose: Initially you will learn Python for data analysis, but it
  can also be used for websites, database management, web scraping, financial
  modeling, data visualization, etc.  In particular, it is the world’s best language for
  [gluing](https://en.wikipedia.org/wiki/Glue_code) those different pieces together.  


However, the general purpose nature of Python comes at a cost: it is often said that Python is “the
best language for nothing but the second best language for everything”.

We aren’t sure this is true, but a more optimistic view of that quote is that Python is a great
language to have in your toolbox to solve all sorts of problems and patch them together.

A versatile “second-best” language might be the best one to learn first.

Some other languages to consider:

- R has an impressive ecosystem of statistical packages, and is defensible as a choice for pure
  data science. It could be a useful second language to learn for projects that are entirely
  statistical.  
- Matlab has much more natural notation for writing linear algebra heavy code.  However, it is:
  (a) expensive; (b) poor at dealing with data analysis; (c) grossly inferior to Python as a
  language; and (d) being left behind as Python and Julia ecosystems expand to more packages.  
- Julia is in part a far better version of Matlab, which can be as fast as Fortran or C.  However,
  it has a young and immature environment and is currently more appropriate for academics and
  scientific computing specialists.  


Another consideration for programming language choice is runtime performance. On this dimension,
Python, R, and Matlab can be slow for certain types of tasks.

Luckily, this will not be an issue for data science and the types of analysis we will do in this
course, because most of the data analytics packages in Python (and R) rely on high-performance
code written in other languages in the background.

If you are writing more traditional scientific/technical computing in Python, there are
[things that can help](http://numba.pydata.org/) make Python faster in some situations,
but another language like Julia may be a better fit.

## First Steps

We are ready to begin writing code!

In this section, we will teach you some basic concepts of programming
and where to search for help.

### Variable Assignment

The first thing we will learn is the idea of *variable assignment*.

Variable assignment associates a value with a variable.

Below, we assign the value “hello, world” to the variable `x`.

In [5]:
x = "hello, world"

Once we have assigned a value to a variable, Python will remember this
variable as long as the *current* session of Python is still running.

Notice how writing `x` into the prompt below outputs the value
“hello, world”.

In [8]:
x
# y  # uncomment this line to ask about y, which has not been created

'hello, world'

However, Python returns an error (a `NameError`) if we ask it about variables that have not yet
been created, such as `y` above.

It is also useful to understand the order in which operations happen.

First, the right side of the equal sign is computed.

Then, that computed value is stored as the variable to the left of the
equal sign.
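
A minimal sketch of this evaluation order. The variable names `total` and `count` are made up for illustration and are not used elsewhere in the lesson.

```python
# The right side (1 + 2) is computed first, then the result is stored in `total`.
total = 1 + 2
print(total)  # 3

# The same rule lets a variable appear on both sides of the equal sign:
# Python reads the current value of `count`, adds 1, then stores the result in `count`.
count = 10
count = count + 1
print(count)  # 11
```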


Keep in mind that the variable binds a name to something stored in memory.

The name can even be bound to a value of a completely different type.

In [None]:
x = 2
print(x)
x = "something else"
print(x)

### Code Comments

Comments are short notes that you leave for yourself and for others who read your
code.

They should be used to explain what the code does.

A comment is made with the `#` character. Python ignores everything on a line that follows a `#`, unless the `#` is inside a string.

Let’s practice making some comments.

In [9]:
i = 1  # Assign the value 1 to variable i
j = 2  # Assign the value 2 to variable j

# We add i and j below this line
i + j

3
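
As a further sketch, the cell below shows the two common comment placements and one case where `#` does not start a comment. The names `rate` and `label` are hypothetical.

```python
# A full-line comment: Python skips this entire line.
rate = 0.05  # A trailing comment: only the text after the # is skipped.

# A # inside quotation marks is part of the string, not a comment.
label = "item #1"
print(label)  # item #1
```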

## Functions

Functions are processes that take an input (or inputs) and produce an output.

If we had a function called `f` that took two arguments `x` and
`y`, we would write `f(x, y)` to use the function.

For example, the function `print` simply prints whatever it is given.
Recall the variable we created called `x`.

In [11]:
print(x)

hello, world
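
A few more function calls, as a rough sketch. `max` and `len` are built into Python. The function `f` is hypothetical and is defined here only to mirror the `f(x, y)` notation above.

```python
# `max` takes two (or more) inputs and returns the largest one.
largest = max(3, 7)
print(largest)  # 7

# `len` takes one input and returns its length.
n_chars = len("hello, world")
print(n_chars)  # 12

# A hypothetical function `f` with two arguments, defined only for illustration.
def f(x, y):
    return x + y

print(f(1, 2))  # 3
```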


### Getting Help

We can figure out what a function does by asking for help.

In Jupyter notebooks, this is done by placing a `?` after the function
name (without using parentheses) and evaluating the cell.

For example, we can ask for help on the print function by writing
`print?`.

Depending on how you launched Jupyter, the help appears in one of two places:

- JupyterLab: as text below the cell.  
- Classic Jupyter Notebooks: in a new panel at the bottom of your
  screen.  You can exit this panel by hitting the escape key or clicking the x at
  the top right of the panel.  

In [12]:
# print? # remove the comment and <Shift-Enter>
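
The `?` syntax only works in Jupyter and IPython. As a sketch of an alternative that also works in plain Python, the built-in `help` function prints similar documentation. The example uses `len` because its help text is short.

```python
# `help` is built into Python and prints the documentation for a function.
help(len)

# The documentation text is also stored on the function itself as `__doc__`.
doc = len.__doc__
print(doc)
```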

## Objects and Types

Everything in Python is an *object*.

Objects are “things” that contain 1) data and 2) functions that can operate on
the data.

Sometimes we refer to the functions inside an object as *methods*.

We can investigate what data is inside an object and which methods
it supports by typing `.` after that particular variable, then
hitting `TAB`.

It should then list data and method names to the right of the
variable name like this:

<img src="https://datascience.quantecon.org/assets/_static/python_fundamentals/introspection.png" alt="introspection.png" style="">

  
You can scroll through this list by using the up and down arrows.

We often refer to this as “tab completion” or “introspection”.

Let’s do this together below. Keep going down until you find the method
`split`.

In [13]:
# Type a period after `x` and then press TAB.
x

'hello, world'

In [14]:
x.split()

['hello,', 'world']
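
Tab completion is interactive, so it cannot be shown in a static cell. A rough equivalent, sketched below, is the built-in `dir` function, which returns the same kind of list as text. The cell reassigns `x` so that it runs on its own. The name `public_names` is made up for illustration.

```python
x = "hello, world"

# `dir` returns the names of the data and methods attached to an object.
names = dir(x)
print("split" in names)  # True

# Hide the names that start with an underscore to get a shorter list.
public_names = [name for name in names if not name.startswith("_")]
print(public_names[:5])

# Call one of the methods found this way.
print(x.upper())  # HELLO, WORLD
```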

We often want to identify what kind of object some value is. This is
called its “type”.

A “type” is an abstraction which defines a set of behaviors for any
“instance” of that type. For example, `2.0` and `3.0` are instances
of `float`, where `float` has a set of particular common behaviors.

In particular, the type determines:

- the available data for any “instance” of the type (where each
  instance may have different values of the data).  
- the methods that can be applied to the object and its data.  


We can figure this out by using the `type` function.

The `type` function takes a single argument and outputs the type of
that argument.

In [15]:
type(3)

int

In [16]:
type("Hello World")

str

In [17]:
type([1, 2, 3])

list
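
To connect `type` back to the idea of instances, here is a small sketch using the `float` values mentioned earlier. The names `a` and `b` are hypothetical. Note that `print(type(a))` shows `<class 'float'>`, while evaluating `type(a)` as the last line of a notebook cell shows just `float`.

```python
a = 2.0
b = 3.0

# Both values are instances of the same type, `float`.
print(type(a))               # <class 'float'>
print(type(a) is type(b))    # True

# `isinstance` checks whether a value is an instance of a given type.
print(isinstance(a, float))  # True
print(isinstance(3, float))  # False

# Instances share methods but hold different data.
print(a.is_integer())        # True
print((2.5).is_integer())    # False
```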

We will learn more about each of these types (and others!) and how to use them
soon, so stay tuned!


<a id='modules'></a>

## Modules

Python takes a modular approach to tools.

By this we mean that sets of related tools are bundled together into *packages*.
(You may also hear the term modules to describe the same thing.)

For example:

- `pandas` is a package that implements the tools necessary to do
  scalable data analysis.  
- `matplotlib` is a package that implements visualization tools.


As we move further into the class, being able to
access these packages will become very important.

We can bring a package’s functionality into our current Python session
by writing