# Introduction to Data Processing with Python 

This is an introduction to Python for someone with little or no prior experience in programming.

In this introduction we will cover:
- Why we use computers to solve problems
- Why we use programs to control how computers work
- Why we use programming languages to write programs
- Why Python is a good choice of a programming language

This is designed for someone who:
- knows their way around a computer, including how to use a file list window to find files
- can use a web browser, including how to open websites and download files to their computer
- has a basic understanding of math, including variables, functions, and comparing numbers 

It teaches how to use computers to automate calculations and tasks that use data.



# Computers

Computers can solve problems involving any kind of *data* that can be stored electronically, 
such as numbers and text.
Computers can process large amounts of *data* faster and more accurately than people.
They are good for tasks that are tedious and do not need a person to solve them.

The advantage of computers is that they can process instructions tirelessly, faster and more accurately than people. 

# Data

*Data* can represent terms, facts, measurements, or statistics.
Data are usually numbers or text, where text is made up of characters typed on the keyboard.


*Information* is the result of summarizing, classifying, and interpreting raw data so that it can be used to make decisions.

For example, the temperature outside now is *data*. 

You can decide whether to go out in shorts or not based on it, but it has limited use.

Combining that data into temperature trends for the week, or the minimum and maximum temperatures for a day, is *information* that can be used to make predictions about the future weather. 

*Data processing* converts data to information for any organization that needs to use data.



Data processing consists of several steps.
1. Collecting raw data from external sources.
1. *Preprocessing* or *cleaning* data to reduce errors, duplications, inconsistencies, and inaccuracies.
1. Storing usable data in files or databases to make it available to analyze.
1. Interpreting data using methods such as statistics and data science to produce actionable information.
1. Providing the resulting information for decision making and planning in a useful and understandable way.
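
As a rough sketch of these steps, the short program below walks through them with a handful of made-up temperature readings. The readings, the variable names, and the use of `-999` to mark a failed reading are all assumptions chosen for illustration. The Python itself has not been explained yet; the point is only how the steps map onto a program.

```python
# 1. Collect raw data (made-up temperature readings in Celsius).
#    -999 is assumed here to mark a reading that failed.
raw_readings = [21.5, 19.0, -999, 23.1, 19.0, -999]

# 2. Clean the data: drop the failed readings.
clean_readings = [t for t in raw_readings if t != -999]

# 3. Store the usable data (here just in a variable; a file or database is more usual).
temperatures = clean_readings

# 4. Interpret the data to produce information.
lowest = min(temperatures)
highest = max(temperatures)

# 5. Present the information in an understandable way.
print(f'The low was {lowest} and the high was {highest}.')
```

The single readings are data; the low and high for the day are information that can be used to make a decision.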

![image ](data-processing1.jpg)

# Programs
Computers can store instructions to solve problems as *programs*.
Storing *programs* allows using them repeatedly to solve problems.



Programs convert input to output.
The input is data and the output is information.
Inside, the program processes the data using logic and calculations.


Computers run programs made up of *instructions* that process data, like a recipe.
- Programmers take input data, [*algorithms*](#glossary_algorithm), and the desired output to write Python programs.



<img  src="data-processing2.jpg" width="1000">


Programs are often written to perform manual work that already exists.
The effort of writing programs is often repaid by automating tasks that are tedious and require little skill. 


# Statements 


Instructions or *statements* are steps that a computer can run.
Statements are usually in a mathematical form, 
using *variables* and *functions* like a mathematical formula.


Instructions are written as statements, and statements use *expressions* to calculate values.

Statements are often written in a mathematical form, such as `a = b + c`.
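
A statement in this mathematical form can be sketched in Python as follows. The names `a`, `b`, and `c` and the values 2 and 3 are only for illustration.

```python
b = 2
c = 3
a = b + c   # add b and c, then store the result in a
print(a)
```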



# Functions 
Functions are sequences of instructions, like small programs.
Functions run their instructions and may return values.
Python gives a full list of its built-in functions, including those discussed below, at this link.

https://docs.python.org/3/library/functions.html


# Variables 
Variables are named locations that store data.

# Programming languages
Computer [*CPUs*](#glossary_cpu) run instructions that convert input data to program output,
but they have to be told how to do it precisely and completely.

A programming language is precise and consistent.

English is too inexact, unclear, ambiguous, and inconsistent to use as a programming language.

Programming languages make programs easy to read and remember what they do.

## Python
*Python* is a programming language that allows telling the computer exactly what to do in a readable and easily understandable way.
Python has many ways to read data, and includes many methods that can be used to process data.
Python allows organizing data to make designing how to calculate with it easier.
Python translates Python code into machine instructions that the computer can run.

## Program Code
Instructions written in a programming language are called *program code*.

## Jupyter
*Jupyter* allows mixing explanations, notes, and discussions with program code you can run.

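In [None]:
# Make a list of the characters in a string.
s = "hi there"
l = [char for char in s]
print(l)
# Split a string into a list of the parts between each "-".
t = "a-b-c--"
m = t.split("-")
print(m)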

## Instructions 
Each instruction is a single step the program takes as it runs.
Instructions read data, 
do calculations,
use logic to decide what later instructions to run,
and write data from the program.
Instructions are written in a readable language that is translated into a form that a computer can run. 
This notebook describes the kinds of instructions that a Python program can run, what they do, and how to write them. 

<img  src="program1.jpg" width="600">

Many programs frequently perform the same tasks, 
so they use *functions* that collect groups of instructions together to save time programming.
Functions are much like programs used within programs.
Python provides a *library* of common program functions.
Much of the effort learning to program is memorizing what these functions do and how to use them in programs.

The following sections describe the basic types of data used in programs and useful functions to process them. 
The sections use knowledge from previous sections so you are meant to go through them one by one. 
When you are done, you will be able to write basic Python programs like the examples at the end. 

The Internet has a wealth of information to make programming easier and more effective. 
[python.org](https://www.python.org) and [w3schools.com](https://www.w3schools.com) are good sources of basic information on using Python. 
There are many useful [books](https://hackr.io/blog/best-python-books-for-beginners-and-advanced-programmers) on Python for further study and to use as references. 
Searching for the type of program you want to write and looking for examples you can use as samples can save you a lot of time programming. 
If Python gives you an error you cannot understand, search for the error and you may find an explanation or solution on websites like [stackexchange.com](https://stackexchange.com) and [stackoverflow.com](https://stackoverflow.com).

## Using Jupyter
Jupyter is a web application that lets you write explanations and figures alongside program code inside a single *Notebook*, where you can run and test out the code. 
You can add your own explanations and notes that you can save and reload at your next session. 
You can perform and record experiments, and then you or others can rerun them at a later time. 

A notebook is a series of *cells*. There are two kinds of cells, *Markdown* and program code.
There is a menu and toolbar that give you functions to run, add, modify, and delete cells. 

<img  src="menu1.jpg">

Select a cell by clicking on it.
Clicking the <img src="plus1.jpg"> icon will add an empty cell below the current cell.
A new cell is initially a program code cell.
Program code can go into cells like the one following.
You run code in Jupyter by selecting the cell and pressing `Shift+Enter`, or selecting **Run Cell** 
in the **Cell** menu. You can also click the <img src="triangle1.jpg" align="top"> icon on the menu above.
The result of running the code will appear below the cell.

In [None]:
print("Hello world")

<table><tr>
<td><img src="markdown1.jpg" align="top"><br></td>
<td>
Explanations like this and notes go in Markdown cells.
To change a new cell to a Markdown cell, select <b>Markdown</b> from the <b>Markdown</b> menu. 
</td>
</tr></table>


Markdown lets you format explanation and note text. 
You can use punctuation like `#` to get headings,
`*` to get italics and bold text, and `[]` and `()` to get web links.

<table><tr>
    <td>
<code># You can add headings like this
You can add *italics* or **bold** text.
## There are sub headings
You can include web links like this [Google](http://www.google.com).
</code>
    </td>
    <td>
<img src="markdown2.jpg" width="300">
    </td>
    </tr></table>

You press `Shift-Enter` to format a Markdown cell. 
You can press `Enter` on a selected cell or double-click on a formatted Markdown cell to edit it.

You can cut or copy the current cell with the <img src="cut1.jpg" align="top"> and <img src="copy1.jpg" align="top"> icons,
or with  **Cut Cells** or **Copy Cells** from the **Edit** menu.
You can paste a cell with the <img  src="paste1.jpg" align="top"> icon,
or with **Paste Cells Below**, **Paste Cells Above**, or **Paste Cells and Replace** from the **Edit** menu.
You can delete the current cell or cells with **Delete Cells** from the **Edit** menu.

<img src="menu2.jpg" width="300">

There are also keyboard shortcuts, so `X` cuts the current cell, `C` copies it, `V` pastes below it, and `D D` deletes the current cell.


This is our first program in Python.
Run the code below by clicking on the cell and typing *Shift-Enter*.

In [None]:
print("Hello world.")

Here is a program that converts a temperature in Celsius to Fahrenheit. The formula is:

&nbsp;&nbsp;&nbsp;&nbsp;Temperature in Fahrenheit = $\frac{9}{5}$ x Temperature in Celsius + 32

Run the code below by clicking on the cell and typing *Shift-Enter*.

In [None]:
temp = float(input('Enter a temperature in Celsius: '))
print('In Fahrenheit, that is', 9/5 * temp + 32)

What the code does is this.

- `input` is a *function* that takes the *string* `'Enter a temperature in Celsius: '` in quotes and prints it out.
It then waits for you to type an answer and returns that as a *string*.
- `float` is a *function* that takes the *string* from `input` and converts it to a number.
- The number is given a name by *assigning* it to the *variable* `temp` so that it can be used later.
- `print` is a *function* that takes the *string* in quotes and prints it out, along with the temperature in `temp` converted from Celsius to Fahrenheit.
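
To check the formula without typing anything in, a minimal sketch can use fixed temperatures instead of `input`. The variable names here are assumptions for illustration. Water freezes at 0 degrees Celsius, which is 32 degrees Fahrenheit, and boils at 100 degrees Celsius, which is 212 degrees Fahrenheit.

```python
freezing_c = 0
boiling_c = 100

# Apply the same formula used in the program above.
freezing_f = 9/5 * freezing_c + 32
boiling_f = 9/5 * boiling_c + 32

print('0 Celsius in Fahrenheit is', freezing_f)
print('100 Celsius in Fahrenheit is', boiling_f)
```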



*Strings* are text that can be inside either two double quotes `"` or two single quotes `'`. 
The same quote starts and ends the string. 

In [None]:
print("this is a double-quoted string")
print('this is a single-quoted string')

If you need to put a single quote in a string, you can use a double-quoted string. 

In [None]:
print("this is a double-quoted string with a single quote ' in it")

If you need to put a double quote in a string, you can use a single-quoted string.

In [None]:
print('this is a single-quoted string with a double quote " in it')

## Variables
Programs pass values from instruction to instruction using *variables*. 
This stores 5 in a variable named `x` and prints it out.
 `=` assigns 5 to `x`. 

In [None]:
x = 5
print("x is", x)

The variable can be shown like this, where `x` is a name for a place in which the value 5 is stored.

![image](variable1.jpg)

Another way to print the variable `x` is with a *formatted string*, beginning with `f'` and ending with `'`. 
This prints out the value of the variable `x` in a formatted string, surrounding the variable `x` with `{` and `}`. 
Any Python expression can be used between `{` and `}`. This can print a variable in a more concise way.

In [None]:
x = 5
print(f'x is {x}.')
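
The braces are not limited to a single variable name: a calculation can go between `{` and `}` as well. This is a minimal sketch with made-up values, and the variable name `message` is only for illustration.

```python
x = 5
y = 2
# The expression x + y is calculated and its value is placed in the string.
message = f'x is {x}, y is {y}, and x + y is {x + y}.'
print(message)
```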

A variable can be assigned a value and then be assigned another value to replace it. This stores 5 in the variable `x`, and prints it out. 
It then stores 7 in `x` and prints it out.

In [None]:
x = 5
print(f'first x is {x}')
x = 7
print(f'then x is {x}')

This can be shown like this, where `x` has the data 5 stored in it that is then replaced with the data 7, where `→` means "set to new value".

![variable2.jpg](variable2.jpg)

A variable can also be assigned the value of another variable. 
This stores 5 in the variable `x`, then assigns the value of `x` to the variable `y` and prints both out.

In [None]:
x = 5
y = x
print(f'x is {x} and y is {y}')

![image](variable3.jpg)

Variables can store text strings as well as numbers.
This stores a string in the variable `s` and prints it out.

In [None]:
s = "Hello world"
print(f'the text is "{s}"')

### A note on variable names
Variable names start with a [*lowercase*](#glossary_lowercase) letter, `a` to `z`, or an [*uppercase*](#glossary_uppercase) letter, `A` to `Z`, or `_`. After the first character they can contain a `_`, letter, or number, but no other punctuation. They cannot have a number at the beginning. 

In [None]:
x = 2
x_y = 3
_x = 4
print(f'x is {x}, x_y is {x_y}, _x is {_x}')
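
One way to check these rules is the string function `isidentifier`, which is part of Python although it is not otherwise covered in this introduction. The names tested below are made up for illustration.

```python
# isidentifier() gives True when a string follows the rules for names.
print("x_y".isidentifier())   # letters and _ are allowed
print("_x2".isidentifier())   # numbers are allowed after the first character
print("2x".isidentifier())    # a name cannot start with a number
print("x-y".isidentifier())   # other punctuation is not allowed
```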

Capital and small letters matter, so the variable name must be written exactly the same way everywhere.

In [None]:
X = 5
x = 6
print(f'X is {X}, x is {x}')

In the code here we will use different variable names for different types of data. They will be these:
- `s`, `t`, `u`, `v`, and so on will have *string* values
- `x`, `y`, `z`, and sometimes `n` will have number values

## Data
Numbers and strings are only two types of data in Python. 
Numbers and strings start as *literal constants* and are the most basic types of data in Python. 
`"hello"` and `5` are literal constants.
Python instructions can combine these values with different *operations* and can store them in variables. 
Instructions can use literal constants or can use the values stored in variables. 
Run this to add two values and store the result in the variable `y`.


In [None]:
x = 5
y = x + 2
print(f'x is {x} and x + 2 is {y}')

The sections below will show examples of operations available for these data types.

### Numbers
Numbers in Python are either *integers*, which are signed whole numbers, or *real numbers*, which can have a fractional part. 
When you do arithmetic you can mix the two types.

In [None]:
x = 5 + 3.2
print(f'5 + 3.2 is {x}')

#### Number operators
The usual arithmetic operators can be used, like addition `+` and subtraction `-`. 


In [None]:
x = 5 + 3
print(f'x = 5 + 3 is {x}')
x = 4 - 1
print(f'x = 4 - 1 is {x}')

You can do multiplication with `*` instead of `x`. 

In [None]:
x = 4 * 3.2
print(f'x = 4 * 3.2 is {x}')

We do division with `/` instead of $\div$.

In [None]:
x = x / 2.0
print(f'x / 2.0 is {x}')

There are some special operators on numbers. 
We compute exponentials, say $3^2$, like this.

In [None]:
x = 3**2
print(f'3**2 is {x}')

Another number operator is *modulo*, like "9 modulo 2", which gives the remainder after dividing 9 by 2. 
You can do modulo operations with `%`.

In [None]:
x = 9 % 2
print(f'9 % 2 is {x}')
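
A common use of modulo is testing whether one number divides evenly into another: the remainder is 0 when it does. A small sketch, with the numbers chosen for illustration:

```python
x = 10 % 2
print(f'10 % 2 is {x}, so 10 is even')
y = 7 % 2
print(f'7 % 2 is {y}, so 7 is odd')
z = 17 % 5
print(f'17 % 5 is {z} because 17 is 3 * 5 + {z}')
```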

#### Operator shortcuts
Some operators are used to change a variable's value, like `x = x + 1`. 
There is a shortcut for each of the operators `+`, `-`, `*`, `/`, `**`, and `%`.

For addition, `x = x + 1`, the shortcut is `x += 1`. 

In [None]:
x = 3
print(f'x starts as {x}')
x += 2
print(f'x += 2 is {x}')

For subtraction, `x = x - 1`, the shortcut is `x -= 1`.

In [None]:
x -= 1
print(f'x -= 1 is {x}')

For multiplication, `x = x * 4`, the shortcut is `x *= 4`.

In [None]:
x *= 4
print(f'x *= 4 is {x}')

For division, `x = x / 4`, the shortcut is `x /= 4`.

In [None]:
x /= 4
print(f'x /= 4 is {x}')

For applying exponents, `x = x ** 2`, the shortcut is `x **= 2`.

In [None]:
x **= 2
print(f'x **= 2 is {x}')

For modulo, `x = x % 3`, the shortcut is `x %= 3`.

In [None]:
x %= 3
print(f'x %= 3 is {x}')

#### `abs` function
The `abs` function takes a number that might be negative and gives its absolute value.

In [None]:
x = 3
y = -5
z = -4.3
print(f'abs({x}) is {abs(x)}.')
print(f'abs({y}) is {abs(y)}.')
print(f'abs({z}) is {abs(z)}.')

#### `max` function
The value of the `max` function is the largest of the numbers given to it as an argument.
`[1, 12, 9]` is a *list* of numbers, which is discussed further below.

In [None]:
L = [1, 12, 9]
print(f'the maximum of {L} is {max(L)}')

#### `min` function
The value of the `min` function is the smallest of the numbers given to it as an argument.

In [None]:
L = [1, 12, 9]
print(f'the minimum of {L} is {min(L)}')

#### `round` function
The `round` function takes a real number and rounds it to fewer digits in the fraction. 
If only the number is given as the argument, it gives the nearest integer. 
If a second argument is given, it keeps that many digits in the fractional part.

In [None]:
x = 5.34
y = 9.2345
z = -8.431
print(f'round({x}) is {round(x)}.')
print(f'round({y}, 2) is {round(y, 2)}.')
print(f'round({z}) is {round(z)}.')

### Strings

Strings are text that is made up of *characters*, which are letters, numbers, punctuation, and some special characters called *escape characters*. 
Strings have operations and functions that combine, remove, search for, and replace parts of text. 
This is the string `"hello"`. Each character has a *position* in the string, and the positions start counting from `0` rather than `1`.

<img src="string1.jpg" width="400">

#### Escape characters
Escape characters are characters following a *backslash* character `\` in a string. 
The backslash gives a special meaning to the character that follows it. 
`\t` means the tab key on the keyboard, and `\n` is the enter key, called a [*newline*](#glossary_newline). 
Because the backslash `\`  changes the meaning of the character that follows it, to really get a backslash you need to use `\\`.

In [None]:
print('this is the tab "\t" character')
print('this is the enter "\n" character')
print('this is the backslash "\\" character')

You can put a double quote in a double-quoted string by using a backslash to *escape* it. An escaped double quote can go into a double-quoted string without ending the string. 

In [None]:
print("this is a double-quoted string with a double quote \" in it")

An escaped single quote can be put in a single-quoted string the same way.

In [None]:
print('this is a single-quoted string with a single quote \' in it')

Putting a single quote in the formatted string starting with `f'` and ending with a `'` needs an escaped single quote. 

In [None]:
print(f'this is a formatted string with a single quote \' in it')

#### Operations on strings
Strings can be combined with the `+` add operator. 

In [None]:
s = "Again and "
s = s + "again ..."
print(f's is "{s}"')

The `+=` shortcut works as well.

In [None]:
s = "Again and "
s += "again ..."
print(f's is "{s}"')

Strings can be repeated with the `*` multiplication operator. 

In [None]:
s = "hello " * 3
print(f's is "{s}"')

The `*=` shortcut works as well.

In [None]:
s = "hello "
s *= 3
print(f's is "{s}"')

#### Characters in strings
Characters in strings can be accessed with `[` and `]`, where *string*`[`*position*`]` gives the character at *position* in the string. 
Python operators and functions count character positions from 0 rather than 1. 
The string can count *position* from the beginning of the string, or if *position* is negative, it can count from the back of the string. 

In [None]:
s = "A red hat"
print(f's starts as "{s}"')
print("positions:   012345678")
t = s[0]
u = s[3]
v = s[-1]
print(f's[0] is "{t}"')
print(f's[3] is "{u}"')
print(f's[-1] is "{v}"')

#### Slices
Strings that are part of other strings are *slices* or *substrings*. 
The slice operator looks like *variable*`[`*start*`:`*end*`]` where the slice has all the characters from the character in the *start* position to the character *just before* the *end* position. 

A slice that has no *start* starts from the beginning of the string. 
A slice that has no *end* goes to the end of the string. 

In [None]:
s = "A red hat"
print(f's starts as "{s}"')
print("positions:   012345678")
t = s[:1]
print(f's[:1] is "{t}"')
t = s[2:5]
print(f's[2:5] is "{t}"')
t = s[6:]
print(f's[6:] is "{t}"')
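
Slice positions can also be negative, counting from the back of the string in the same way as single positions. A short sketch using the same string:

```python
s = "A red hat"
t = s[-3:]      # the last three characters
print(f's[-3:] is "{t}"')
u = s[:-4]      # everything except the last four characters
print(f's[:-4] is "{u}"')
```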

#### `len` function
The `len` function gives the length of a string, which is the number of characters in it.

In [None]:
s = "A red hat"
length = len(s)
print(f'len("{s}") is {length}')

#### `find` function
The `find` function gives the first position where you can find a substring in a string. 
The link https://docs.python.org/3/library/stdtypes.html#string-methods gives a full list of string functions.

In [None]:
s = "A red hat"
print(f's starts as "{s}"')
print("positions:   012345678")
x = s.find("red")
print(f's.find("red") is {x}')
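
When the substring is not in the string, `find` gives `-1` instead of a position. A minimal sketch, with the search words chosen for illustration:

```python
s = "A red hat"
x = s.find("hat")
print(f's.find("hat") is {x}')
y = s.find("blue")    # "blue" is not in the string
print(f's.find("blue") is {y}')
```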

#### `count` function
The `count` function will count how many times a substring can be found in a string.

In [None]:
s = "the cat in the hat is at the store."
print(f's starts as "{s}"')
x = s.count("the")
print(f's.count("the") is {x}')
x = s.count("at")
print(f's.count("at") is {x}')

#### `upper` function
The `upper` function makes all alphabet characters in a string into capitals or [*uppercase*](#glossary_uppercase).

In [None]:
s = "the cat in the hat is at the store"
print(f's starts as "{s}"')
t = s.upper()
print(f's.upper() is "{t}".')

#### `lower` function
The `lower` function makes all alphabet characters  in a string into small letters or [*lowercase*](#glossary_lowercase).

In [None]:
s = "THE CAT IN THE HAT IS AT THE STORE"
print(f's starts as "{s}"')
t = s.lower()
print(f's.lower() is "{t}"')

#### `replace` function
The `replace` function finds every place where a substring appears in a string and replaces it with another string. It gives the result as a new string and does not change the original.

In [None]:
s = "the red hat"
print(f's starts as "{s}"')
t = s.replace("red", "blue")
print(f's.replace("red", "blue") is "{t}"')

#### `str` function 
If you have a number, the `str` function will let you use it as a string. This is taking str() of an integer.

In [None]:
x = 5
s = str(x)
print(f'str({x}) is "{s}"')

This is taking str() of a real number.

In [None]:
x = 3.1
s = str(x)
print(f'str({x}) is "{s}"')

#### `int` function
If you have a string but would like to use it as a number, the `int` function will make an integer out of it.

In [None]:
s = "5"
x = int(s)
print(f'int("{s}") is {x}')

If the string is not a number, you get an error.

In [None]:
s = "invalid"
x = int(s)
print(f'int("{s}") is {x}')
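
If the error should not stop the program, one common Python approach is `try` and `except`, which this introduction has not covered. The sketch below assumes the error raised by `int` for a string like this is a `ValueError`, and it sets `x` to `None` purely as an illustrative way of marking that no number was found.

```python
s = "invalid"
try:
    x = int(s)
    print(f'int("{s}") is {x}')
except ValueError:
    # int() raises ValueError when the string is not a whole number.
    x = None
    print(f'"{s}" cannot be converted to an integer')
```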

#### `float` function
If you have a string but would like to use it as a number, the `float` function will make a real number out of it.

In [None]:
s = "3.1"
x = float(s)
print(f'float("{s}") is {x}')

#### `input` function
The `input` function can print a string on the console and wait for you to type some text, then give the text you typed as the value.

In [None]:
s = input('Enter some text: ')
print(f'The text you typed is "{s}"')

#### `strip` function
The `strip` function removes whitespace, including a *newline*, from the beginning and end of a string. Lines read from a text file will end with a newline. Printing a line that still ends with a newline will add an extra blank line.

<img src="strip1.jpg" width="400">

In [None]:
s = "line from file\n"
print(s)
print(s.strip())
print("end of printing")
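
`strip` also removes spaces and tabs, and it works on both ends of the string. The sketch below uses a made-up line and also shows `rstrip`, a related string function that only removes from the right end.

```python
s = "  line from file \n"
t = s.strip()      # removes whitespace from both ends
u = s.rstrip()     # removes whitespace from the right end only
print(f'strip gives "{t}"')
print(f'rstrip gives "{u}"')
```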

#### `format` function
The `format` function can make output look just the way you want. 
The format function looks like this.

&nbsp;&nbsp;&nbsp;&nbsp;`'{:`*format*`} {:`*format*`} ...'.format(`*item*`,` *item*`, ...)`

where each *format* is one of those listed below, and each *item* fills in the `{` `}` in the same position. 

This is an example of a `format` output that includes more than one *format*.

&nbsp;&nbsp;&nbsp;&nbsp;`'{:17s}  ${:7.2f}'.format(name, total)`
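
A short sketch of how that example might be used. The values of `name` and `total` and the variable name `line` are made up for illustration.

```python
name = "Coffee"
total = 3.5
# name is padded to 17 characters and total is shown 7 wide with 2 decimals.
line = '{:17s}  ${:7.2f}'.format(name, total)
print(f'"{line}"')
print(f'the line is {len(line)} characters long')
```

Because both widths are fixed, every line printed this way has its columns in the same place.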

The `:3d` format will make a string out of an integer where the string is at least 3 characters wide, with spaces on the left if the integer is less than 3 characters long.

In [None]:
s ='{:3d}'.format(2)
t ='{:3d}'.format(65)
u ='{:3d}'.format(138)
print(f'"{s}" is the number 2 with format ":3d"')
print(f'"{t}" is the number 65 with format ":3d"')
print(f'"{u}" is the number 138 with format ":3d"')

The `:<3d` format will make a string out of an integer where the string is at least 3 characters wide, with spaces on the right if the integer is less than 3 characters long.

In [None]:
s ='{:<3d}'.format(2)
t ='{:<3d}'.format(65)
u ='{:<3d}'.format(138)
print(f'"{s}" is the number 2 with format ":<3d"')
print(f'"{t}" is the number 65 with format ":<3d"')
print(f'"{u}" is the number 138 with format ":<3d"')

The `:^3d` format will make a string out of an integer where the string is at least 3 characters wide but with the number in the middle if the integer is less than 3 characters long.

In [None]:
s ='{:^3d}'.format(2)
t ='{:^3d}'.format(65)
u ='{:^3d}'.format(138)
print(f'"{s}" is the number 2 with format ":^3d"')
print(f'"{t}" is the number 65 with format ":^3d"')
print(f'"{u}" is the number 138 with format ":^3d"')

For real numbers, the `:.2f` format will make a string out of a real number where the string has exactly 2 digits in the fraction, with zeros on the right if the fraction of the real number has fewer than 2 digits.

In [None]:
s ='{:.2f}'.format(4)
t ='{:.2f}'.format(1.9)
u ='{:.2f}'.format(8.26)
print(f'"{s}" is the number 4 with format ":.2f"')
print(f'"{t}" is the number 1.9 with format ":.2f"')
print(f'"{u}" is the number 8.26 with format ":.2f"')

For strings, the `:5s` format will make a string at least 5 characters wide, with spaces on the right if the string is less than 5 characters long. 
The `:>5s` format will make the string at least 5 characters wide but with spaces on the left, and `:^5s` will make the string at least 5 characters wide but with the text in
the center of the string.

In [None]:
s ='{:5s}'.format("one")
t ='{:>5s}'.format("one")
u ='{:^5s}'.format("one")
v ='{:^5s}'.format("eight")
print(f'"{s}" is the "one" with format ":5s"')
print(f'"{t}" is the "one" with format ":>5s"')
print(f'"{u}" is the "one" with format ":^5s"')
print(f'"{v}" is the "eight" with format ":^5s"')

## Compound Data Types
There are *compound data types* that are collections of multiple data values that can be assigned to variables. 
Two of these are *lists* and *dictionaries*, and are used to store things like grocery lists and telephone books that collect items of basic data types together. 

Variable names for compound data types will be:
- `L`, `M`, `N`, and so on will be lists
- `D`, `E`, `F`, and so on will be dictionaries

### Lists
The simplest compound data type is the *list*. 
Lists collect a series of other data type values in order between square brackets `[` and `]`. 
As with strings, the first element is at position 0 rather than 1.

In [None]:
L = [5, 9, 1]
print(f'L is {L}')

The list can be shown like this.

![list1.jpg](list1.jpg)

#### Operations on lists
Lists can be combined with the `+` add operator. 

In [None]:
L = [5, 9, 10]
L = L + [4, 7]
print(f'[5, 9, 10] + [4, 7] is {L}')

The `+=` shortcut works as well.

In [None]:
L = [5, 9, 10]
L += [4, 7]
print(f'[5, 9, 10] + [4, 7] is {L}')

Lists can be repeated with the `*` multiplication operator. 

In [None]:
L = [1, 2] * 3
print(f'[1, 2] * 3 is {L}')

The `*=` shortcut works as well.

In [None]:
L = [4, 5]
L *= 2
print(f'[4, 5] * 2 is {L}')

#### Items in lists
Each list item value can be accessed  with `[` and `]`, where *list*`[`*position*`]` gives the item at *position* in the list. 
List operators and functions count item positions from 0 rather than 1.
The list can count *position* from the beginning of the list or if *position* is negative, it can count from the back of the list. 

In [None]:
L = [5, 9, 1]
x = L[0]
y = L[-2]
z = L[2]
print(f'L is {L}')
print(f'L[0] is {x}')
print(f'L[-2] is {y}')
print(f'L[2] is {z}')

The value of a list item can be changed in the list by assigning to the item with its position in square brackets `[` and `]`. 
Like before, positions start from 0. 

In [None]:
L = [5, 9, 1]
print(f'L starts as {L}')
L[0] = 4 
L[-2] = 6 
L[2] = 2 
print(f'L after changing is {L}')

#### Lists and variables
There is not room to store the whole list in a variable, so the variable stores a *pointer* to the list. 
That can be shown like this. 

![list2.jpg](list2.jpg)

When a list is assigned to a variable and that variable is assigned to another variable, the entire list isn't copied into the second variable. 
Only the *pointer* is copied.

![list3.jpg](list3.jpg)

When both variables point to the same list, a change made to the list through the first variable also shows up when looking through the second variable.

In [None]:
L = [5, 9, 1]
print(f'L starts as {L}.')
M = L
print(f'after M = L, M is {M}.')
L[1] = 2
print(f'after L[1] = 2, L is {L} and M is {M}.')

<img src="list4.jpg" width="400">

#### `copy` function
To assign a list to another variable and be able to change the list through the second variable without affecting the first list, the `copy` function makes a copy of the list before assigning it to the second variable. 
Changing the second list then doesn't change the first list.

In [None]:
L = [5, 9, 1]
print(f'L starts as {L}.')
M = L.copy()
print(f'after M = L.copy(), M is {M}.')
L[1] = 2
print(f'after L[1] = 2, L is {L} and M is {M}.')

![list5.jpg](list5.jpg)
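
One way to see whether two variables point to the same list is the `is` operator, which is part of Python but not otherwise covered here. A minimal sketch:

```python
L = [5, 9, 1]
M = L           # M points to the same list as L
N = L.copy()    # N points to a separate copy
print(f'M is L: {M is L}')
print(f'N is L: {N is L}')
print(f'N == L: {N == L}')   # the copy still has equal items
```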

#### Slices
Lists that are part of other lists are *slices* or *sublists*. 
The slice operator looks like *variable*`[`*start*`:`*end*`]` where the slice is all the items from the item in the *start* position to the item *just before* the *end* position. 
The slice operator does not affect the original list; it gives a new list holding only the items in the slice.

A slice that has no *start* starts from the beginning of the list. 
A slice that has no *end* goes to the end of the list. 

In [None]:
L = [5, 19, 1, 10, 8, 21]
print(f'L is {L}')
x = L[:1]
print(f'L[:1] is {x}')
x = L[:3]
print(f'L[:3] is {x}')
x = L[4:]
print(f'L[4:] is {x}')

#### `len` function
The `len` function gives the length of a list, which is the number of items in it.

In [None]:
L = [5, 9, 1]
x = len(L)
print(f'len({L}) is {x}')

#### `append` function
New items can be added at the end of the list with the `append` function.

In [None]:
L = [5, 9, 1]
print(f'L starts as {L}')
L.append(15)
print(f'after L.append(15), L is {L}')

When the `append` function is called for a list, 
the way it is called is *list*`.append(`*item*`)`. 
The list is still given to `append` as an argument even though it is not passed between `(` and `)`. 
When a list is given as an argument to a function, the pointer is given to the function.
The function can then use the pointer to change the list,
and the list will still be changed when the function is done. 

<img  src="list6.jpg" width="400">
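
The same thing happens with a function you write yourself. The sketch below uses `def` to define a small function, which this introduction has not covered yet. The function name `add_one_item` and the value 99 are made up for illustration.

```python
def add_one_item(some_list):
    # some_list points to the same list that was passed in,
    # so appending here changes the caller's list.
    some_list.append(99)

L = [5, 9, 1]
print(f'L starts as {L}')
add_one_item(L)
print(f'after add_one_item(L), L is {L}')
```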

#### `insert` function
Items can be added in the beginning or the middle of the list with the `insert` function. 
The position where the item is added is the first argument to `insert`, starting from 0, and the item to add is the second argument. 
All the items after the one added are moved down the list. 

In [None]:
L = [5, 9, 1]
print(f'L starts as {L} and the length is {len(L)}. L[2] is {L[2]}.')
L.insert(2, 4)
print(f'L.insert(2, 4) is {L} and the length is {len(L)}. L[2] is {L[2]}.')

This shows `L.insert(2, 4)`.

<img  src="list7.jpg" width="400">

In [None]:
L = [5, 9, 1]
print(f'L starts as {L} and the length is {len(L)}. L[2] is {L[2]}.')
L.insert(0, 7)
print(f'L.insert(0, 7) is {L} and the length is {len(L)}. L[2] is {L[2]}.')

This shows `L.insert(0, 7)`.

<img src="list8.jpg"  width="400">

#### `pop` function
An item at some position can be removed with the `pop` function. 
The position of the item to remove is the first argument to `pop`, starting from 0. 
All items after the one removed are moved up the list.
The item removed is the value of `pop`.

In [None]:
L = [5, 9, 1]
print(f'L starts as {L} and the length is {len(L)}.')
x = L.pop(1)
print(f'L after L.pop(1) is {L} and the length is {len(L)}. The item removed is {x}.')

This shows `L.pop(1)`, which gives the 9 removed as the value.

<img src="list9.jpg" width="400">

#### `remove` function
The first item having some value can be removed from the list with the `remove` function. 
The value of the item to remove is the first argument to `remove`. 
Only the first item having the value will be removed.
All items after the one removed are moved up the list.

In [None]:
L = [5, 9, 1, 9]
print(f'L starts as {L} and the length is {len(L)}')
L.remove(9)
print(f'L after L.remove(9) is {L} and the length is {len(L)}.')

This is removing `9` a second time.

In [None]:
L.remove(9)
print(f'L after L.remove(9) is {L} and the length is {len(L)}.')

This shows calling `L.remove(9)` twice.

<img src="list11.jpg" width="400">

#### `extend` function
A list can be added to the end of a list with the `extend` function. This is like the `+` operator, except that `extend` changes the first list instead of making a new one. 
The list to add at the end of the first list is the argument to `extend`.

In [None]:
L = [5, 9, 1]
print(f'L starts as {L} and the length is {len(L)}')
M = [2, 4]
L.extend(M)
print(f'after L.extend(M), M is {M}, L is {L} and len(L) is {len(L)}.')
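
To compare `extend` with `+`, this sketch does both on the same lists. The names `N` and `length_after_plus` are only used for this illustration.

```python
L = [5, 9, 1]
M = [2, 4]
N = L + M                    # + makes a new list and leaves L alone
length_after_plus = len(L)
print(f'after N = L + M, N is {N} and L is still {L}')
L.extend(M)                  # extend changes L itself
print(f'after L.extend(M), L is {L}')
```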

This shows `L.extend(M)`.

<img  src="list10.jpg" width="700">

#### `sum`, `min`, `max` functions

Lists that are just numbers have functions to give the sum, minimum, and maximum. 

In [None]:
L = [4, 1, 8, 2]
print(f'sum({L}) = {sum(L)}')
print(f'min({L}) = {min(L)}')
print(f'max({L}) = {max(L)}')

#### `sort` function
Lists can be sorted with the `sort` function into increasing number or alphabetic order.
This shows sorting a list of numbers.

In [None]:
L = [4, 1, 8, 2]
print(f'L starts as = {L}')
L.sort()
print(f'after L.sort() L = {L}')

This shows `L` after `L.sort()`.

<img  src="list12.jpg" width="500">

This shows sorting a list of strings.

In [None]:
L = ["horse", "cat", "ox", "dog"]
print(f'L starts as = {L}')
L.sort()
print(f'after L.sort() L = {L}')

This shows `L` after `L.sort()`.

<img  src="list13.jpg" width="400">

#### `reverse` function
List items can be reversed with the `reverse` function.

In [None]:
L = [4, 1, 8, 2]
print(f'L starts as = {L}')
L.reverse()
print(f'after L.reverse() L = {L}')

This shows `L` after `L.reverse()`.

<img src="reverse1.jpg" width="400">

#### Lists from strings: `split` function
Lists can be created from strings with the `split` function.
Using `split` with no arguments splits a string at spaces.

<img src="split1.jpg" width="400">

In [None]:
s = "The cat in the hat"
print(f's starts as = "{s}"')
L = s.split()
print(f'after s.split() L is {L}')

`split` can split strings separated by a particular string given as an argument. 

In [None]:
s = "first,second,third"
print(f's starts as = {s}')
L = s.split(",")
print(f'after s.split(",") L is {L}')
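
Both forms of `split` are handy for pulling apart a line of data. This sketch assumes a sample purchase record held in a variable named `line`. It first splits the line at spaces and then splits the date at each `/`.

```python
line = "05/20/2021 $20.11 Office supplies"
fields = line.split()            # split at spaces
date = fields[0]                 # the first item is the date
date_parts = date.split("/")     # split the date at each "/"
print(f'fields is {fields}')
print(f'date_parts is {date_parts}')
```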

#### Strings from lists: `join` function
Strings can be created from list items by joining the list items separated by a particular string.

In [None]:
L = ["first", "second", "third"]
print(f'L starts as {L}')
s = " ".join(L)
print(f'" ".join(L) is "{s}"')
L = ["O", "N", "E"]
print(f'L starts as {L}')
s = "".join(L)
print(f'"".join(L) is "{s}"')

This shows `" ".join(L)`.

<img src="join1.jpg" width="400">

### Dictionaries
*Dictionaries* are a compound data type similar to *lists*. 
In a dictionary you look up a value using another value, a *key*, rather than looking up a value by position as in a list. 
This is like a real dictionary, where you look up the definition of a word using the word. 

In [None]:
D = {"leaf": "green", "sky": "blue", "apple": "red"}
print(f'D is {D}')

The dictionary can be shown like this.

![dict1.jpg](dict1.jpg)

#### Items in dictionaries
Each dictionary item can be accessed with `[` and `]`, where *dictionary*`[`*key*`]` gives the value that is assigned for that key in the dictionary. 

In [None]:
D = {"leaf": "green", "sky": "blue", "apple": "red"}
leaf = D["leaf"]
apple = D["apple"]
print(f'D["leaf"] is {leaf}')
print(f'D["apple"] is {apple}')

The value of a dictionary item for a key can be changed by assigning that item with the key in square brackets `[` and `]`. 

In [None]:
D["apple"] = "honeycrisp"
print(f'after D["apple"] = "honeycrisp", D["apple"]  is "{D["apple"]}"')

New dictionary items can be created by assigning a value with a new key in square brackets `[` and `]`. 

In [None]:
D = { "leaf": "green", "sky": "blue", "apple": "red"}
print(f'D starts as {D} length {len(D)}')
D["earth"] = "brown"
print(f'after D["earth"] = "brown", D is {D} length {len(D)}')

#### Dictionaries and variables
There is not room to store a whole dictionary in a variable, so the variable stores a *pointer* to the dictionary. 
That can be shown like this. 

![dict2.jpg](dict2.jpg)

When a dictionary is assigned to a variable and that variable is assigned to another variable, the entire dictionary isn't copied into the second variable. 
Only the pointer is copied.

![dict3.jpg](dict3.jpg)

We can show that both variables point to the same dictionary: when we change the dictionary through the first variable, we will see the change when looking through the second variable.

In [None]:
D = { "leaf": "green", "sky": "blue", "apple": "red"}
print(f'D starts as {D}.')
E = D
print(f'after E = D, E is {E}.')
D["apple"] = "yellow"
print(f'after D["apple"] = "yellow", D is {D} and E is {E}.')

<img src="dict4.jpg" width="400">

#### `copy` function
To assign a dictionary to another variable and be able to change it through the second variable without affecting the first dictionary, use the `copy` function, which makes a copy of the dictionary before it is assigned to the second variable. 
Changing the second dictionary then doesn't change the first dictionary.

In [None]:
D = { "leaf": "green", "sky": "blue", "apple": "red"}
print(f'D starts as {D}')
E = D.copy()
print(f'after E = D.copy(), E is {E}.')
D["apple"] = "yellow"
print(f'after D["apple"] = "yellow", D is {D} and E is {E}')

![dict5.jpg ](dict5.jpg)

#### `len` function
The `len` function has the value of the dictionary length.

In [None]:
D = {"leaf": "green", "sky": "blue", "apple": "red"}
print(f'D starts as {D}')
x = len(D)
print(f'len(D) is {x}')

#### `keys` function
The `keys` function gives the dictionary keys. The keys are not actually in `list` form, so you need the `list` function to convert them to a list.

In [None]:
D = {"leaf": "green", "sky": "blue", "apple": "red"}
print(f'D starts as {D}')
L = list(D.keys())
print(f'D.keys() is {L}')

#### `pop` function and `del` operator
The `pop` function can remove an item from a dictionary. 
The key of the value to remove is the first argument to `pop`. 
The dictionary is one item smaller after `pop`. 
The `del` operator can remove an item too.

In [None]:
D = {"leaf": "green", "sky": "blue", "apple": "red"}
print(f'D starts as {D}')
D.pop("leaf")
print(f'after D.pop("leaf"), D is {D}')
del D["sky"]
print(f'after del D["sky"], D is {D}')
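
Like `pop` for lists, `pop` for dictionaries gives the removed value as its value, while `del` gives nothing back. This sketch stores the removed value in a variable named `color`, a name chosen only for this example.

```python
D = {"leaf": "green", "sky": "blue", "apple": "red"}
color = D.pop("leaf")        # color gets the value that was stored for "leaf"
print(f'D.pop("leaf") gave "{color}" and D is now {D}')
```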

## Instruction control

Programs may not want to run instructions just in the order they are given. 
You can run instructions only if something is true, repeat instructions while something is true, or for each item in a group. 
Examples are `if`, `while`, and `for` statements.

### `if` statements
An `if` statement can run instructions if something is true.
An `if` statement looks like:


`if` *condition*`:` \
 &nbsp;&nbsp;&nbsp;&nbsp;*some instruction* \
 &nbsp;&nbsp;&nbsp;&nbsp;*another instruction* \
 &nbsp;&nbsp;&nbsp;&nbsp;*and more...* \
*instruction run after previous instructions*

<img src="if3.jpg" width="400">

An `if` statement tests a *condition*, like `x < 5`, and if it is true runs the indented instructions after it. 
The next un-indented instruction is run whether or not the `if` *condition* is true. 
The indents have to be one or more spaces, and must be the same for each instruction. 
Here the first test is true and a line is printed. 
The second test is not true so no line is printed for that. 
The last line is always printed.

In [None]:
x = 5
if x > 3:
    print("x is more than 3")
if x < 2:
    print("x is less than 2")
print("after x tests")

### Conditions
Tests for `if` statements may use different operators or functions. An operator for a condition might be `>` in `x > 3`. 

In [None]:
x = 5
if x > 3:
    print(f'x is > 3')

In other cases a function might be used in the condition. 

In [None]:
s = "hello"
if len(s) > 3:
    print(f'len(s) is > 3')

#### Combining conditions
Sometimes a condition might want to test more than one thing. 
An example is wanting a number to be between a lower and higher value. 
The `and` operator can do more than one test at a time.

In [None]:
x = 4
print(f'x starts as {x}')
if x > 2 and x < 6:
    print(f'{x} is between 2 and 6')
if x > 5 and x < 9:
    print(f'{x} is between 5 and 9')

The `or` operator can test if either one test or another is true. 
Here the second test is true because when x is 4, it is less than 9 even if it is no more than 5.

In [None]:
x = 4
print(f'x starts as {x}')
if x > 2 or x < 6:
    print(f'{x} is more than 2 or less than 6')
if x > 5 or x < 9:
    print(f'{x} is more than 5 or less than 9')

The `not` operator can test whether a condition is false.

In [None]:
x = 3
print(f'x starts as {x}')
if x > 2:
    print(f'x {x} is greater than 2')
if not x < 1:
    print(f'x {x} is not less than 1')
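
A condition has the value `True` or `False`, and that value can be stored in a variable and printed. This sketch, with variable names made up for the example, shows what `and`, `or`, and `not` give for the same two tests.

```python
x = 4
both = x > 5 and x < 9       # False, because the first test is false
either = x > 5 or x < 9      # True, because the second test is true
neither = not either         # not turns True into False
print(f'x is {x}: both is {both}, either is {either}, neither is {neither}')
```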

#### Tests with numbers

There are lots of condition operators for numbers. 
`==` tests if numbers are the same.

In [None]:
s = input("type a number to test:")
x = int(s)
if x == 3: 
    print(f"{x} is 3")

`>` tests if a number is more than another.

In [None]:
s = input("type a number to test:")
x = int(s)
if x > 3: 
    print(f"{x} is greater than 3")

`<` tests if a number is less than another.

In [None]:
s = input("type a number to test:")
x = int(s)
if x < 3: 
    print(f"{x} is less than 3")

`>=` tests if a number is the same or more than another.

In [None]:
s = input("type a number to test:")
x = int(s)
if x >= 3: 
    print(f"{x} is greater than or equal to 3")

`<=` tests if a number is the same or less than another.

In [None]:
s = input("type a number to test:")
x = int(s)
if x <= 3: 
    print(f"{x} is less than or equal to 3")

`!=` tests if numbers are not the same.

In [None]:
s = input("type a number to test:")
x = int(s)
if x != 3: 
    print(f"{x} is not 3")

#### Tests with strings
The `==` operator can test whether strings are the same. 

In [None]:
s = input(f'type a string to test against "test":')
t = "test"
if s == t: 
    print(f'"{s}" is "{t}"')

The `!=` operator can test whether strings are different.

In [None]:
s = input(f'type a string to test against "test":')
t = "test"
if s != t: 
    print(f'"{s}" is not "{t}"')    

The `in` operator can test whether a string is part of another.

In [None]:
s = input(f'type a string to test against "test":')
t = "test"
if t in s: 
    print(f'"{t}" is part of "{s}"')

There are a number of tests for the types of characters in a string using functions.
*string*`.isalpha()` tests whether all characters in a string are letters, such as "a" to "z" or "A" to "Z".

In [None]:
s = input(f'type a string to test:')
if s.isalpha(): 
    print(f'"{s}" has all letters.')

*string*`.isdigit()` tests whether all characters in a string are digits.

In [None]:
s = input(f'type a string to test:')
if s.isdigit(): 
    print(f'"{s}" has all digits.')

*string*`.isalnum()` tests whether all characters in a string are digits or letters.

In [None]:
s = input(f'type a string to test:')
if s.isalnum(): 
    print(f'"{s}" has all numbers or letters.')

*string*`.isspace()` tests whether all characters in a string are space characters, such as spaces, tabs, or newlines.

In [None]:
s = input(f'type a string to test:')
if s.isspace(): 
    print(f'"{s}" has all spaces.')

#### Tests with lists
The `in` operator tests if an item is in a list.

In [None]:
L = [1, 3, 9]
print(f'L starts as {L}')
if 3 in L:
    print(f'3 is in {L}')

#### Tests with dictionary
The `in` operator tests if a key is in a dictionary.

In [None]:
D = {"sky": "blue", "leaf": "green"}
print(f'D starts as {D}')
if "sky" in D:
    print(f'"sky" is in {D}')

### `else` statements

You can test both whether a condition is true or false in one `if` statement with `else`. 
The instructions after the `if` are run if the condition is true and the instructions after `else` are run if the condition is not true.

`if` *condition*`:`\
&nbsp;&nbsp;&nbsp;&nbsp;*instruction*\
&nbsp;&nbsp;&nbsp;&nbsp;*instruction*\
`else:`\
&nbsp;&nbsp;&nbsp;&nbsp;*instruction*\
&nbsp;&nbsp;&nbsp;&nbsp;*instruction*

<img  src="if1.jpg" width="500">

In [None]:
s = input(f'type a string to test:')
if s.isalpha(): 
    print(f'"{s}" is all letters.')
else: 
    print(f'"{s}" is not all letters.')

### `elif` statements
You can test a number of conditions in one `if` statement with `elif`. 
`elif` lets you test one condition after another. 
The instructions after the `if` are run if its condition is true. Otherwise the instructions after the first `elif` whose condition is true are run. 
There can be an `else` at the end that runs if none of the conditions are true.

`if` *condition*`:`\
&nbsp;&nbsp;&nbsp;&nbsp;*instruction*\
&nbsp;&nbsp;&nbsp;&nbsp;*instruction*\
`elif` *condition*`:`\
&nbsp;&nbsp;&nbsp;&nbsp;*instruction*\
&nbsp;&nbsp;&nbsp;&nbsp;*instruction*\
`elif` *condition*`:`\
&nbsp;&nbsp;&nbsp;&nbsp;*instruction*\
&nbsp;&nbsp;&nbsp;&nbsp;*instruction*\
`else:`\
&nbsp;&nbsp;&nbsp;&nbsp;*instruction*\
&nbsp;&nbsp;&nbsp;&nbsp;*instruction*

<img src="if2.jpg" width="600">

In [None]:
s = input("type a number to test:")
x = int(s)
if x > 9: 
    print(f"{x} is greater than 9")
elif x >= 3: 
    print(f"{x} is between 3 and 9")
else: 
    print(f"{x} is less than 3")

### `while` loop
`if` statements let you run some instructions one time if some condition is true. 
`while` statements let you run some instructions multiple times until a condition is false. 
A while loop looks like:

`while` *condition*`:`\
&nbsp;&nbsp;&nbsp;&nbsp;*instruction*\
&nbsp;&nbsp;&nbsp;&nbsp;*instruction*

<img src="while1.jpg" width="400">

There is one catch: the condition has to use some variables, and the instructions have to change the value of the variables so that at some time the condition becomes false. 
Otherwise the program will run those instructions forever. 
In this loop, `x` starts below 6, but goes up 1 each time so that it eventually becomes 6 and the loop stops.

In [None]:
x = 2
while x < 6:
    print(f'x is {x}')
    x += 1

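This `while` loop can convert pounds to kilograms until you enter 0.

In [None]:
s = input("give the number of pounds to convert to kilograms, type 0 to stop:")
while s != "0":
    x = int(s)
    # one kilogram is about 2.2 pounds, so divide the pounds by 2.2
    print(f'{x} pounds is {round(x / 2.2, 2)} kilograms')
    s = input("give the number of pounds to convert to kilograms, type 0 to stop:")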

### `for` loop

The `for` loop is like a `while` loop except that it runs instructions once for each item in a group. 
A `for` loop looks like this.

`for` *item* `in` *group*`:`\
&nbsp;&nbsp;&nbsp;&nbsp;*instruction*\
&nbsp;&nbsp;&nbsp;&nbsp;*instruction*

<img src="for1.jpg" width="600">

This `for` loop runs for each character in a string.

In [None]:
print("a for loop for a string")
s = "abcd"
for c in s:
    print(f'the next character of "{s}" is "{c}"')

This `for` loop runs for each item in a list.

In [None]:
print("a for loop for a list")
L = ["first", "second", "third"]
for i in L:
    print(f'the next item in "{L}" is "{i}"')

### `range` function
It's often useful to run instructions some number of times.
`for` loops often run for each number in a group of numbers. 
The `range` function creates a sequence of numbers for the `for` loop to run on. 
A `for` loop using `range(n)` with one argument loops through the numbers `0` to `n-1`. 

In [None]:
for i in range(4):
    print(f'the next number in range(4) is {i}') 

Using `range(m, n)` with two arguments will loop through the numbers `m` to `n-1`.

In [None]:
for i in range(2, 5):
    print(f'the next number in range(2, 5) is {i}') 

Using `range(m, n, step)` starts at `m` and adds `step` each time, stopping before the number reaches `n`.

In [None]:
for i in range(3, 9, 2):
    print(f'the next number in range(3, 9, 2) is {i}') 

Using `range(m, n, -1)` the loop can count down from `m` to `n+1`.

In [None]:
for i in range(5, 2, -1):
    print(f'the next number in range(5, 2, -1) is {i}') 
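
To see all the numbers a `range` produces at once, the `list` function can convert it to a list. This is a short sketch; the variable names are made up for the example.

```python
counting_up = list(range(3, 9, 2))       # start at 3, add 2, stop before 9
counting_down = list(range(5, 2, -1))    # start at 5, subtract 1, stop before 2
print(f'list(range(3, 9, 2)) is {counting_up}')
print(f'list(range(5, 2, -1)) is {counting_down}')
```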

### Computing a square root

Python has a `sqrt` function to compute a square root of a number.

In [None]:
import math
x = int(input('Give the number you want the square root of:'))
print(f'the square root of {x} is {math.sqrt(x)}')

`sqrt` is one of the functions Python supplies for common tasks to save us time programming. 
`import math` says we will use the `math` *module* where a module is a group of functions. 
We call `math.sqrt` to tell Python that `sqrt` is part of the `math` module. 

We can compute the square root ourselves by using a formula. 
If we make a *guess* of the square root and divide the number by it, a right guess gives the guess itself back as the result.

In [None]:
x = 9
y = 9/3

print(f'making a guess that 3 is the square root of {x}, the result of dividing {x} by 3 is {y}')

If the guess is wrong, the result of dividing will be different, but the right guess will be somewhere between the two.

We'll try an idea for the program. 
- If we have a number, we make a guess and divide it into the number. 
- If the guess and the result of dividing it into the number are the same, we've found the square root.
- If they are not the same, we can try a new guess that is the number halfway between the guess and the result.

The number halfway between two numbers is their average, or $\frac{guess + \frac{number}{guess}}{2}$.

In [None]:
x = 9
y = 2
z = x/y
print(f'dividing {x} by {y} gives {z}, and the average of the two is {(y + z) / 2}')

It turns out the second step is not as easy as we think.
The guess will almost always be a number with a fractional part, and the guess and the result might be very close but still not exactly the same.
Instead of testing whether they are the *same*, we will test whether their difference is less than some small number, say `0.001`.
We'll also use some different variable names to try to keep track of things better.
We'll try half of the number as the first guess.

In [None]:
import math
x = int(input('Give the number you want the square root of:'))
guess = x / 2
next_guess = (guess + x / guess) / 2
tries = 0
while abs(guess - next_guess) > 0.001:
    tries += 1
    guess = next_guess
    next_guess = (guess + x / guess) / 2
    print(f'try {tries}, the guess is {round(guess, 4)}, the next guess is {round(next_guess, 4)}, and the difference is {round(abs(guess - next_guess), 4)}')
print(f'the final guess is {round(guess, 4)} and the square root from math.sqrt({x}) is {round(math.sqrt(x), 4)}')

Try a large number and see how many tries it takes. 
It usually won't take that many.
The program did not take many instructions.
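
The same method can be tried without typing anything in. This is a smaller sketch, assuming a fixed number `2`. It is a variation on the program above: it tests the guess directly against the result of dividing the number by the guess, and stops when the two are within `0.001` of each other.

```python
x = 2                      # the number we want the square root of
guess = x / 2              # first guess: half of the number
tries = 0
while abs(guess - x / guess) > 0.001:
    # the new guess is the average of the guess and the number divided by the guess
    guess = (guess + x / guess) / 2
    tries += 1
print(f'after {tries} tries the guess for the square root of {x} is {round(guess, 4)}')
```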

### Finding the factors of a number

The factors of a number are those that divide evenly into the number. 
If we have a factor, then the number divided by the factor has a remainder of `0`.
We can find the remainder with the *modulo* operator `%`.

We can test that we have all the factors in a `while` loop.
We know we have all the factors when dividing the number by each of them in turn leaves `1`.
A factor may occur more than once, so we need a `while` loop inside the `while` loop to test each factor more than once.
`1` is always a factor of the number, so we'll start testing with `2`.

In [None]:
number = int(input('Give the number you want the factors of:'))
factor = 2
factors = []
x = number
while x > 1:
    while x % factor == 0:
        factors.append(factor)
        # integer division keeps x a whole number
        x = x // factor
    factor += 1
print(f'the factors of {number} are {factors}')
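
To check the method without typing a number, this sketch assumes a fixed number, `360`. It uses `//` (integer division) so that `x` stays a whole number, and then multiplies the factors back together as a check.

```python
number = 360
factors = []
factor = 2
x = number
while x > 1:
    while x % factor == 0:
        factors.append(factor)
        x = x // factor          # // keeps x a whole number
    factor += 1
print(f'the factors of {number} are {factors}')

# multiplying the factors back together should give the number again
product = 1
for f in factors:
    product *= f
print(f'multiplying the factors together gives {product}')
```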

### Finding prime numbers
The only factors of a prime number are 1 and the number itself. 
Prime numbers are very useful for cryptography and are good to know. 
We will use the factor example above as a *function* to find them. 
The function leaves 1 out of the list it returns, so a number is prime when the list has just one factor, the number itself.
You will give us the largest number to test as prime.

In [None]:
def factors(number):
    found = []
    factor = 2
    x = number
    while x > 1:
        while x % factor == 0:
            found.append(factor)
            # integer division keeps x a whole number
            x = x // factor
        factor += 1
    return found

largest_prime = int(input('Give the largest number you want to test as prime:'))
primes = []
for i in range(2, largest_prime + 1):
    factors_of_i = factors(i)
    if len(factors_of_i) == 1:
        primes.append(i)
print(f'the prime numbers up to {largest_prime} are {primes}')

Another way to test for prime numbers is the *sieve of Eratosthenes*, named after the person who invented the method. 
You start by assuming all numbers up to the largest number are prime.
For each number starting with 2, you know that two times the number, three times the number, and so on up to the largest number are not prime, because each has the number as a factor besides 1 and itself.
For each number, you then go through the list and mark those multiples as not prime.
You would not need to test even numbers after 2, because 2 already marks all even numbers after it, but the program below tests every number to stay simple.
We use a list comprehension to create a list of `True`s as long as the largest number.
We will set each element to `False` that is a multiple of the number being tested.

In [None]:
largest_prime = int(input('Give the largest number you want to test as prime:'))
numbers = [True for i in range(largest_prime)]
for i in range(2, largest_prime):
    next_nonprime = i * 2
    while next_nonprime <= largest_prime:
        numbers[next_nonprime - 1] = False
        next_nonprime += i
primes = []
for i in range(1, largest_prime):
    if numbers[i]:
        primes.append(i + 1)
print(f'the prime numbers up to {largest_prime} are {primes}')

Try finding the primes up to 30000 with both programs.
The sieve of Eratosthenes should be a lot faster.
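
This is a variation on the sieve, offered as a sketch and not as the only way to write it. It assumes a fixed largest number, `30`, uses the number itself as the position in the list (so positions 0 and 1 are not prime), and skips numbers that are already marked as not prime.

```python
largest = 30
# is_prime[n] says whether n is still thought to be prime
is_prime = [True for n in range(largest + 1)]
is_prime[0] = False
is_prime[1] = False
for n in range(2, largest + 1):
    if is_prime[n]:
        # mark 2 * n, 3 * n, and so on as not prime
        multiple = n * 2
        while multiple <= largest:
            is_prime[multiple] = False
            multiple += n
primes = []
for n in range(2, largest + 1):
    if is_prime[n]:
        primes.append(n)
print(f'the prime numbers up to {largest} are {primes}')
```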

## List comprehensions

Lists can be made using `for` loops. 
A *list comprehension* has a `for` loop inside `[` and `]` to create the items in a list. 
List comprehensions look like:

`[`*formula* `for` *variable* `in` *group*`]`

where the *formula* gives a value using the *variable* and the *variable* has the value of each item in the group.
This can be an easy shorthand compared to creating the list manually.

In [None]:
L = []
for i in range(5):
  L.append(i)
print(f'L produced manually is {L}')
L = [i for i in range(5)]
print(f'[i for i in range(5)] produces {L}')
L = [2*i for i in range(5)]
print(f'[2*i for i in range(5)] produces {L}')
L = [c for c in "abc"]
print(f'[c for c in "abc"] produces {L}')

If the formula does not use the variable, the list will just repeat the value of the formula.

In [None]:
L = [1 for i in range(5)]
print(f'[1 for i in range(5)] produces {L}')

List comprehensions can make creating many different kinds of lists easier because any formula or `for` loop can be used.

## Dictionary comprehensions

Dictionaries can be made using `for` loops. 
A *dictionary comprehension* has a `for` loop inside `{` and `}` to create the items in a dictionary. 
Dictionary comprehensions look like:

`{`*key*`:` *value* `for` *variable* `in` *group*`}`

where the *key* formula gives the key of each item and the *value* formula gives the value stored for that key, either or both using the *variable*.

In [None]:
keys = ["sky", "leaf"]
print(f'keys starts as {keys}')
values = ["blue", "green"]
print(f'values starts as {values}')
colors = {keys[i]: values[i] for i in range(len(keys))}
print('{keys[i]: values[i] for i in range(len(keys))} is ', end="")
print(f'{colors}')

Dictionary comprehensions can make creating many different kinds of dictionaries easier because any formula or `for` loop can be used.
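
As another small sketch, the key and the value can both come from the same variable. Here each word is a key and its length is the value; the names `words` and `lengths` are made up for the example.

```python
words = ["sky", "leaf", "apple"]
lengths = {w: len(w) for w in words}     # key is the word, value is its length
print('{w: len(w) for w in words} is ', end="")
print(f'{lengths}')
```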

## Functions
When a group of instructions needs to be used many times in programs, *functions* save time in programming. 
Functions can take values as *arguments*, and return values if needed. 
Python offers a library of useful functions.
Functions are defined with `def`.
This is an example of a function with no arguments.

In [None]:
def hello():
    print("hello")
print(f'hello() does this:')
hello()

This is an example of a function with arguments.

In [None]:
def print_this(s):
    print(s)
print(f'print_this("hello") does this:')
print_this("hello")    

This is an example of a function that returns a value.

In [None]:
def increase(x):
    return x+1
print(f'increase(5) gives this: {increase(5)}')

Variables created inside a function can only be used inside that function. 
A variable inside a function and a variable with the same name in the instructions that call the function are different variables. 

In [None]:
def variable_test():
    x = 3
    print(f'in variable_test, after x = 3, x is {x}')
x = 5
print(f'x starts as {x}')
variable_test()
print(f'in the main program after variable_test(), x is {x}')
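
Putting these pieces together, this sketch shows a function that takes an argument, uses a variable of its own, and returns a value. The function name `average` is made up for the example and is not a built-in Python function.

```python
# average is a made-up name for this sketch
def average(numbers):
    total = sum(numbers)          # total only exists inside average
    return total / len(numbers)

temperatures = [61, 70, 64]
result = average(temperatures)
print(f'average({temperatures}) gives {result}')
```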

## Regular Expressions
*Regular expressions* are *patterns* that match parts of strings. 
A regular expression can tell if a string matches a pattern.
Regular expressions are defined inside a Python library called `re`.
`import` allows using a Python library in a program.

The variable name `p` will be used to store patterns.

In [None]:
import re
s = "this needs to be matched"
print(f's starts as "{s}"')
p = "match"
print(f'p starts as "{p}"')
if re.search(p, s):
    print(f're.search(p, s) is True')

A pattern is made of normal string keyboard characters and escape characters, but some characters and escape characters have special meaning in patterns. 
- period `.` matches any character
- the escape character `\.` is needed to match a period, because a period has a special meaning
- the escape character `\d` matches any number character, or *digit*, 0 to 9
- the escape character `\D` is the opposite of `\d` and matches anything that is not a digit
- the escape character `\w` matches any digit, alphabet character a to z or A to Z, or underscore `_`
- the escape character `\W` is the opposite of `\w` and matches anything that is not a digit, alphabet character, or underscore
- the escape character `\s` matches any *space character* which are spaces `" "`, tabs `\t`, or newlines `\n`
- the escape character `\S` is the opposite of `\s` and matches anything that is not a space character

<img src="regex1.jpg" width="300">

In [None]:
import re
s = "there are 12 months in the year"
print(f's starts as "{s}"')
p = r"\d\d"   # the r before the quotes keeps the backslashes exactly as typed
print(f'p starts as "{p}"')
if re.search(p, s):
    print(f're.search(p, s) is True')

There are special characters that control how characters match in a pattern.
- plus `+` means the normal character or escape character just before it will match one or more of that character in the string
- star `*` means the normal character or escape character just before it will match zero or more of that character in the string
- question mark `?` means the normal character or escape character just before it will match if that character is in the string but will still match if it is missing
- caret `^` at the beginning of the pattern means the pattern must match starting at the beginning of the string
- dollar sign `$` at the end of the pattern means the pattern must match ending at the end of the string
- to match a plus, star, question mark, caret, or dollar sign, the escape character version must be used, `\+`, `\*`, `\?`, `\^`, or `\$`

In [None]:
import re
s = "there are 12 months in the year"
print(f's starts as "{s}"')
p = r"\d+"
print(f'p starts as "{p}"')
if re.search(p, s):
   print(f're.search(p, s) is True')
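
The caret and dollar sign can be combined to require that the whole string matches. This sketch assumes three made-up test strings and keeps the ones that are digits from beginning to end. The pattern is written with an `r` before the quotes so Python keeps the backslash as typed.

```python
import re
p = r"^\d+$"      # one or more digits, from the beginning to the end of the string
matched = []
for s in ["2011", "year 2011", "2011 AD"]:
    if re.search(p, s):
        matched.append(s)
        print(f'"{s}" matches "{p}"')
    else:
        print(f'"{s}" does not match "{p}"')
```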

The pattern to match a date like *month*/*day*/*year*, where month and day are two digits like "03/05/2011", is `\d\d/\d\d/\d\d\d\d`, where the `\d` are escape characters and slash `/` matches a normal slash.

<img src="regex2.jpg" width="200">

In [None]:
import re
s = "03/05/2011"
print(f's starts as "{s}"')
p = r"\d\d/\d\d/\d\d\d\d"
print(f'p starts as "{p}"')
if re.search(p, s):
   print(f're.search(p, s) is True')

There are special characters to control how many times characters will match in a pattern.
- `{`*count*`}` will match the character just before it *count* times
- `{`*min-count*`,`*max-count*`}` will match the character just before it at least *min-count* times but at most *max-count* times

The pattern to match a date can be `\d{2}/\d{2}/\d{4}` this way.

<img src="regex3.jpg" width="200">

In [None]:
import re
s = "03/05/2011"
print(f's starts as "{s}"')
p = r"\d{2}/\d{2}/\d{4}"
print(f'p starts as "{p}"')
if re.search(p, s):
   print(f're.search(p, s) is True')

Groups of characters and escape characters can match together using `(` and `)`.

In [None]:
import re
s = "banana"
print(f's starts as "{s}"')
p = "(an){2}"
print(f'p starts as "{p}"')
if re.search(p, s):
   print(f're.search(p, s) is True')

The `|` can be used in a group inside `(` and `)` to let either of two patterns match.

In [None]:
import re
s = "this has been a long day"
print(f's starts as "{s}"')
p = "a (long|short) day"
print(f'p starts as "{p}"')
if re.search(p, s):
  print(f're.search(p, s) is True')

When you match a pattern with escape characters, you can get the exact part of the string matched with the `group` function.

In [None]:
import re
s = "today 03/05/2011 is March 5th, 2011"
print(f's starts as "{s}"')
p = r"\d{2}/\d{2}/\d{4}"
print(f'p starts as "{p}"')
m = re.search(p, s)
if m:
    print(f'after re.search(p, s), m.group() is "{m.group()}"')

By grouping parts of the pattern with `(` and `)`, the `group` function can give the part of the string matched by each group.

<img src="regex4.jpg" width="300">

In [None]:
import re
s = "on 03/05/2011 it rained"
print(f's starts as "{s}"')
p = r"(\d{2})/(\d{2})/(\d{4})"
print(f'p starts as "{p}"')
m = re.search(p, s)
if m:
    print(f'after re.search(p, s), m.group() is "{m.group()}"')
    print(f'm.group(1) is the month "{m.group(1)}"')
    print(f'm.group(2) is the day "{m.group(2)}"')
    print(f'm.group(3) is the year "{m.group(3)}"')

Building a regular expression can be tricky, and you can use tools like `regex101.com` to visualize how a given regular expression will match some input text.

### Roman Numerals
The problem is converting Roman numerals like `III` to 3. 
The Roman numerals up to 4 look like this.
```
4    3    2    1
IV   III  II   I
```

We can use regular expressions to recognize these.
We can show one to three, which are `I`, `II`, or `III`, as 1, 2, or 3 `I`'s.
This can match the regular expression `I{1,3}`.
We can test this using `re.search`.
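
This is a minimal sketch of that test. Two parts are assumptions added for illustration: the pattern also accepts `IV` using `|`, and a dictionary named `values` is one possible way to finish the conversion once a string is recognized.

```python
import re
p = "^(I{1,3}|IV)$"      # one to three I's, or IV, and nothing else
values = {"I": 1, "II": 2, "III": 3, "IV": 4}
results = []
for s in ["I", "II", "III", "IV", "IIII"]:
    if re.search(p, s):
        results.append(values[s])
        print(f'"{s}" is {values[s]}')
    else:
        print(f'"{s}" is not a Roman numeral from 1 to 4')
```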

## Files
Files hold data that programs use.
- Programs can read data from files.
- Programs can write data to files.
- Programs can add data to files that already are there.

A file can be treated like a list by looping through it, which will execute the code for each line in that file.

Each line will have the newline (`"\n"`) at the end, which usually should be removed with `strip()`.

### `open` and `close` functions
The `open` function gets a program ready to read or write a file.
- `open(`*file-name*`)` gets the program ready to read a file.
- `open(`*file-name*`, 'w')` gets the program ready to write a file.
- `open(`*file-name*`, 'a')` gets the program ready to add to a file that already exists.

`open` returns a *file descriptor* used to read or write to the file. 
The `close` function is used to complete reading or writing to a file descriptor.

`file.txt` contains these lines.

```
05/20/2021 $20.11 Office supplies
12/01/2021 $123.11 Desk chair
06/14/2021 $50.82 Telephone bill
```

The variable names `f`, `g`, and so on will be used for file descriptors.

In [None]:
f = open("file.txt", "r")
for s in f:
  print(s.strip())
f.close()
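
Writing and adding to a file work the same way, using the `write` function of the file descriptor. This is a sketch that assumes a made-up file name, `notes.txt`, which it creates in the current folder. Each line written needs its own `"\n"` at the end.

```python
# "notes.txt" is a made-up file name used only for this sketch
f = open("notes.txt", "w")        # 'w' starts a new, empty file
f.write("first line\n")
f.close()

f = open("notes.txt", "a")        # 'a' adds to the end of the existing file
f.write("second line\n")
f.close()

lines = []
f = open("notes.txt")             # no second argument means read
for s in f:
    lines.append(s.strip())
f.close()
print(f'notes.txt contains {lines}')
```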

# Examples
These are examples of programs that put together what we've learned.

## File processing
One of the most common uses of computers is to process file data. 
Much of the data used in data processing starts as text that can be read in files.
File processing programs read data in files, analyze it somehow, and save the data in some useful way.

### Summarizing groups of records
The program below reads the office supply purchase data from the file `file.txt` used above and prints a summary of the purchases with the total cost.

This example uses regular expressions to separate the different items on each record.
These are regular expressions to pull out the date, cost, and name of supplies.
- date `(\d{2}/\d{2}/\d{4})`
- cost `\$(\d+\.\d+)` where `\$` and `\.` are escaped characters because `$` and `.` are special characters
- name of supplies `(.*)` where `.*` matches all the text after the cost

`import re` is needed to start using regular expressions. 
The function `m.group(1)` returns the string matching the first part of the
regular expression in parentheses, here `(\d{2}/\d{2}/\d{4})`.
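
To see what each `group` returns before reading a whole file, here is a small sketch that matches a single record held in a string. The variable names `record` and `pattern` are chosen just for this example.

```python
import re

record = "05/20/2021 $20.11 Office supplies"
# the r prefix keeps the backslashes in the pattern as written
pattern = r"(\d{2}/\d{2}/\d{4}) \$(\d+\.\d+) (.*)"

m = re.search(pattern, record)
date = m.group(1)          # first set of parentheses
cost = float(m.group(2))   # second set, converted from text to a number
name = m.group(3)          # third set, the rest of the line
print(date, cost, name)
```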

<img src="regex5.jpg" width="400">

In [None]:
import re
# the r prefix keeps the backslashes in the pattern as written
p = r"(\d{2}/\d{2}/\d{4}) \$(\d+\.\d+) (.*)"
f = open("file.txt", "r")
total = 0.0
for line in f:
    m = re.search(p, line)
    print()
    print(line.strip())
    if m:
        date = m.group(1)
        cost = m.group(2)
        name = m.group(3)
        total += float(cost)
        print(f'date {date} cost ${cost} name "{name}"')
    else:
        print("    line is not the right format")
f.close()
print()
print(f'total cost: ${total:.2f}')

### Summarizing groups of records by type
This example uses regular expressions to pull the cost out of each line in a file and accumulate the costs by type.
The example uses the file `orders.txt`, which is a data source for purchases of cell phone, cable TV, and internet services.
Each line has the format:
- date
- customer
- cost
- service purchased

The program takes the date, customer name, cost, and service purchased from each line and totals the purchases by customer and by service.
`orders.txt` contains these lines.

```
01/05/2022 Colin Heath $70.00 Phone-Cell
01/11/2022 Courtney Collins $14.99 Cable-HBO
01/12/2022 John Johnson $70.00 Phone-Cell
01/17/2022 Colin Heath $50.00 Internet-100MB
02/18/2022 John Johnson $80.00 Internet-1GB
02/21/2022 Courtney Collins $6.99 Cable-Hulu
02/23/2022 Colin Heath $6.99 Cable-Hulu
03/07/2022 Courtney Collins $50.00 Internet-100MB
```

These are regular expressions to pull out the date, customer name, cost, and service.
- date `(\d{2}/\d{2}/\d{4})`
- first name `(\w+)`
- last name `(\w+)`
- cost `\$(\d+\.\d+)` where `\$` and `\.` are escaped characters because `$` and `.` are special characters
- name of service `(.*)` where `.*` matches all the text after the cost

These are notes about the program.
- The dictionaries *customers* and *services* are used to hold the total cost for the purchases for each person and each service.
- The function `add_to` adds the cost of an item to the totals for its customer and its service.
- The first time a customer or service is seen, it is not yet in the dictionary, so the cost is assigned to a new dictionary entry. After that, the cost is added to the total already in that entry.
- The results are printed in a readable way using the string function `format`.
- `import re` is needed to start using regular expressions. 
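
The "assign the first time, add after that" idea can also be written in one line with the dictionary function `get`. This is only a sketch of an alternative, not the way the program in this section does it, and the function name `add_cost` is made up for the example.

```python
def add_cost(totals, key, cost):
    # totals.get(key, 0.0) is the running total, or 0.0 the first time the key is seen
    totals[key] = totals.get(key, 0.0) + cost

services = {}
add_cost(services, "Phone-Cell", 70.00)
add_cost(services, "Cable-Hulu", 6.99)
add_cost(services, "Phone-Cell", 70.00)
print(services)
```

After the three calls, `Phone-Cell` holds the sum of its two purchases and `Cable-Hulu` holds its single purchase.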

<img src="regex6.jpg" width="400">

In [None]:
import re

def print_totals(item, totals):
    # totals is a dictionary of name -> total cost
    # (named totals so it does not hide Python's own dict)
    keys = list(totals.keys())
    keys.sort()
    item_header = '{:17s}'.format(item)
    print(f'{item_header}  costs')
    print("-----------------  --------")
    for key in keys:
        name = key
        total = totals[key]
        print('{:17s}  ${:7.2f}'.format(name, total))

def add_to(customers, name, services, item, cost):
    if name in customers:
        customers[name] += cost
    else:
        customers[name] = cost
    if item in services:
        services[item] += cost
    else:
        services[item] = cost

# the r prefix keeps the backslashes in the pattern as written
p = r"(\d{2}/\d{2}/\d{4}) (\w+) (\w+) \$(\d+\.\d+) (.*)"
f = open("orders.txt", "r")
customers = {}
services = {}
print('date         name                 service         cost')
print('------------ -------------------- --------------- --------')
for s in f:
    line = s.strip()
    m = re.search(p, line)
    if m:
        date = m.group(1)
        first = m.group(2)
        last = m.group(3)
        full_name = f'{last}, {first}'
        cost = m.group(4)
        cost_float = float(cost)
        service = m.group(5)
        print('{:12s} {:20s} {:15s} ${:7.2f}'.format(date, full_name, service, cost_float))
        add_to(customers, full_name, services, service, cost_float)
    else:
        print("    line is not the right format")
        print(line)
f.close()
print()
print(f'total cost for customers')
print_totals("customers", customers)
print()
print(f'total cost for services')
print_totals("services", services)


### *Joining* two data sources
Often the data for a program comes from two or more data sources.
These may be in files, or in <a href="#glossary_database">*databases*</a>,
where the data are records containing different data items.
Two data sources may have records that describe the same thing.

This example combines data describing movies in two files.
The example uses the file `movies.txt`, which is a data source for movies and when they were released, and the file `actors.txt`, which is a data source for the actors
in those movies.

Each line of `movies.txt` has the format:
- name of the movie
- date released

Each line of `actors.txt` has the format:
- name of an actor
- a movie they starred in

The program combines the data in both files by taking a movie name from `actors.txt` and finding the movie's release date from `movies.txt`, 
and printing one line with these data items.
- actor
- movie
- release date

The program pulls out the actor, movie, and release date from the records.
Because actor and movie names contain spaces, the data items on each line are separated
with `,` instead of a space.

`movies.txt` contains these lines.
```
Forrest Gump,1994
Kate & Leopold,1998
```

`actors.txt` contains these lines.
```
Meg Ryan,Kate & Leopold
Tom Hanks,Forrest Gump
```

The actors dictionary is this.

<img src="actors1.jpg" width="700">

The movies dictionary is this.

<img src="movies1.jpg" width="700">

The actors and movies dictionaries are joined by the movie.

<img src="join2.jpg" width="400">

In [None]:
def print_actors(movies, actors):
    actor_names = list(actors.keys())
    actor_names.sort()
    print(f'actor         movie              release date')
    print("------------  -----------------  ------------")
    for actor in actor_names:
        movie = actors[actor]
        release_date = movies[movie]
        print('{:12s}  {:17s}  {:4s}'.format(actor, movie, release_date))

def add_actor(actors, line):
    items = line.split(",")
    actor = items[0]
    movie = items[1]
    actors[actor] = movie

def add_movie(movies, line):
    items = line.split(",")
    movie = items[0]
    release_date = items[1]
    movies[movie] = release_date

f = open("actors.txt", "r")
actors = {}
for s in f:
    line = s.strip()
    add_actor(actors, line)
f.close()

g = open("movies.txt", "r")
movies = {}
for s in g:
    line = s.strip()
    add_movie(movies, line)
g.close()

print_actors(movies, actors)
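
The join above assumes every movie named in `actors.txt` also has a line in `movies.txt`. If one is missing, `movies[movie]` stops the program with an error. The sketch below shows one way to handle that case with the dictionary function `get`. The function name `join_actors`, the placeholder text `"unknown"`, and the extra actor entry are all made up for this example.

```python
def join_actors(movies, actors):
    rows = []
    actor_names = list(actors.keys())
    actor_names.sort()
    for actor in actor_names:
        movie = actors[actor]
        # get returns "unknown" when the movie has no record in movies
        release_date = movies.get(movie, "unknown")
        rows.append((actor, movie, release_date))
    return rows

movies = {"Forrest Gump": "1994", "Kate & Leopold": "1998"}
actors = {
    "Meg Ryan": "Kate & Leopold",
    "Tom Hanks": "Forrest Gump",
    "Example Actor": "Unlisted Movie",  # hypothetical entry with no matching movie
}

rows = join_actors(movies, actors)
for actor, movie, release_date in rows:
    print('{:13s}  {:17s}  {:7s}'.format(actor, movie, release_date))
```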

## Data Structures
Composite data or *data structures* simplify much of programming. 
Knowing how to program with algorithms using composite data is one of the basic skills of programming. 

### Sorting a list<a name="algorithm_sorting"></a>
The `sort` function can sort lists, but writing our own sort is a good programming example.
Suppose we have a list we want to sort from lowest to highest.

<img  src="sort1.jpg" width="300">

One way to sort the list is to start at the bottom and move the lowest number to the top. Here, 3 is swapped with 8, then 3 is swapped with 9.

<img  src="sort2.jpg" width="600">

The lowest number is now at the top. Next, start at the bottom of the list again and move the second lowest number to be second in the list. Here, 8 is swapped with 9.

<img  src="sort3.jpg" width="400">

The second lowest number is in the second position and the list is now sorted. 

The way we program this is to say what we want to do in English first.

- Start with a list *n* long
- *Bubble* the lowest item up to the top of the list
- Bubble the next lowest item up to the second item in the list
- Repeat this until we bubble the second highest number to the second-to-the-last item in the list

We compare the last item in the list to the one above it, and swap them if the last item is less.
We keep comparing and swapping one spot higher each time, until we have compared the second item in the list to the first item.
The lowest item is now at the top.
We then start again at the last item, but this time we stop after comparing the third item to the second item, because the top item is already in place.
Each round bubbles one item into place: first to the top of the list, then to the second spot, and so on down to the second-to-the-last spot.
That is *n-1* rounds, because there are *n-1* spots from the top to the second-to-the-last spot, and after them the last item must be the highest.

We want a program that looks like this:
```
sort(L):
  n = len(L)
  # bubble the lowest item to the top, then the second, down to the second-to-last spot in the list
  for i in 1 to n-1:
    from j = n (the last item) down to i+1 # the spot below the one we're bubbling to
      compare the item at j to the item at j-1
      swap them if the item at j is less than the item at j-1
```      
That is about it, except for shifting the spots in the list, because list positions start at 0, not 1. Here is the program as a function, with some prints to track what it's doing.

In [None]:
def sort(L):
    n = len(L)
    print(f'at the start L is {L} and len(L) is {n}')
    for i in range(n-1):
        print(f'  bubbling the next lowest item to the position {i}')
        for j in range(n-1, i, -1):
            print(f'    comparing L[{j}] = {L[j]} to L[{j-1}] = {L[j-1]}')
            if (L[j] < L[j-1]):
                print(f'    swapping  L[{j}] = {L[j]} and L[{j-1}] = {L[j-1]}')
                x = L[j]
                L[j] = L[j-1]
                L[j-1] = x
        print(f'    L at round {i} is {L}')
    
L = [9, 8, 3]
sort(L)
print(f'at the end L is {L}')

The final list matches our diagram. The steps to write this program were:
- coming up with a small test case that shows how the program works
- thinking out an idea of how the program might work, here the key was *bubbling* the lowest element to its right position each time
- making a diagram of steps as the program runs
- writing the idea in English in a way that the steps are understandable
- writing the Python instructions, printing out variables at various steps as the program runs

This program is useful when what we define as a *lower* item is more complicated than *the number value of the item is less*. 
The items might be strings, or *less* might mean that a "mouse" is less than a "rabbit", a "rabbit" is less than a "dog", and so forth.
We would change the comparison `L[j] < L[j-1]` to our own comparison to test if an item is *less* than another. 
We saw how `for` loops can control the bubbling, and that a small program can sort a list of any size.
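
Here is a sketch of that change for the animal example. The dictionary `rank` and the functions `is_less` and `sort_with` are made up for this example. The sort is the same bubbling as before, without the prints, and it calls the comparison function it is given instead of using `<` on the items directly.

```python
# Hypothetical ranking: a smaller number means the animal counts as "less".
rank = {"mouse": 0, "rabbit": 1, "dog": 2}

def is_less(a, b):
    return rank[a] < rank[b]

def sort_with(L, less):
    n = len(L)
    for i in range(n-1):
        for j in range(n-1, i, -1):
            # our own comparison replaces L[j] < L[j-1]
            if less(L[j], L[j-1]):
                x = L[j]
                L[j] = L[j-1]
                L[j-1] = x

animals = ["dog", "mouse", "rabbit", "mouse"]
sort_with(animals, is_less)
print(animals)
```

The list comes out as mouse, mouse, rabbit, dog, which is the order set by `rank` and not alphabetical order.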

## Data Cleaning

A common task in Data Science is cleaning raw data to make it usable for further analysis.
Cleaning corrects or removes data that is:

- incorrectly formatted
- duplicated in fields or lines
- inconsistent, non-standard, or mislabeled in its categories or classes
- corrupted, invalid, inaccurate, irrelevant, incomplete, or an outlier
- affected by typos, misspellings, or syntax errors
- missing codes, data, or fields


### Removing duplicate lines in a file

A common task in data cleaning is removing duplicates in files. 
In this example we have a sample data file with some duplicate lines, and we want to remove the duplicates so the file has only unique lines.
Before removing duplicate lines, the file may need to be [*sorted*](#algorithm_sorting) first, so that duplicates are next to each other.
The program idea is that we look at the *previous line* to decide if a line is a duplicate: 
- case 1: the previous line is the same, then the current line is a duplicate
- case 2: the previous line is not the same, then the current line is new and should be kept (printed back out)

We then see there are two more cases:
- case 3: the first line in the file. There is no previous line, so the current line is new and should be kept.
- case 4: we run out of lines in the file. There is no current line, so there is nothing we have to do.

We can see in this case that:
- the first line, line 0 (we always count from 0), is A, this is case 3 so we print it
- the second line, line 1, is B, this is case 2 so we print it
- the third line, line 2, is B, this is case 1 so we ignore it
- the fourth line, line 3, is C, this is case 2 so we print it
- the fifth line, line 4, is D, this is case 2 so we print it
- the sixth line, line 5, is D, this is case 1 so we ignore it

![duplicate1.jpg](duplicate1.jpg)

We can now write the program. `last` is the last line we saw. `first` tells us whether we are at the first line of the file, which is case 3. We read all the lines of the file, and don't need to do anything at the end, just close the file.

Let's say the file `duplicates.txt` is this:
```
A
B
B
C
D
D
```
This is the program. `strip` is used because each line has a *newline*, "\n", at the end, which we can ignore. 
The line beginning with `#` is a *comment*.
It is for notes we make about what the program is doing, and it is ignored when the program runs.
The first time we read a line, `first` is `True`, so we have case 3 and we print the line.
After the first line is read, `first` is set to `False`, and `elif` tests every later line against the last one.
If the test is true, the line is different from the last and we have case 2, and we print the line.
If the test is false, the line is the same as the last and we have case 1, and we ignore the line.

In [None]:
first = True
f = open("duplicates.txt", "r")
for line in f:
    current = line.strip()
    if first:
        first = False
        last = current
        print(current)
    elif current != last:
        print(current)
    # otherwise, if current == last it is a duplicate and we ignore it
    last = current
f.close()
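
If the duplicates are not next to each other, the lines can be sorted first and then checked against the previous line in the same way. The sketch below does this on a list of lines instead of a file. The function name `unique_sorted` and the sample list are made up for this example.

```python
def unique_sorted(lines):
    ordered = list(lines)   # copy, so the list passed in is not changed
    ordered.sort()          # sorting puts duplicates next to each other
    result = []
    first = True
    last = None
    for current in ordered:
        # case 3 (first line) or case 2 (different from the previous line)
        if first or current != last:
            result.append(current)
        first = False
        last = current
    return result

print(unique_sorted(["D", "A", "D", "B", "A"]))
```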

### Another way to remove duplicate lines

If the purpose is to remove duplicates and the order of the lines does not matter, this is another way to remove duplicate lines. 
A dictionary has only unique keys.
When a dictionary is first assigned with a key, a new entry is created.
The second time the dictionary is assigned with the key, no new entry is created but the previous value is replaced.
By using each line that is read as a key, the keys will keep only one copy of it.
We can read the file, assign a dictionary entry with the line as the key with any value, and print the keys at the end.
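
A very small sketch shows the key behavior on its own, before any file is read. The dictionary name `unique` is chosen just for this example.

```python
unique = {}
unique["B"] = 1
unique["A"] = 1
unique["B"] = 1   # "B" is already a key, so no new entry is created
print(list(unique.keys()))
```

Three assignments were made, but the dictionary has only two keys.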

Let's say the file `unordered-duplicates.txt` is this:
```
D
D
A
C
B
B
```
Since Python 3.7, a dictionary keeps its keys in the order they were first added, so the unique lines come out in the order they first appear in the file. This is shown below.

<img src="duplicate2.jpg" width="600">

In [None]:
f = open("unordered-duplicates.txt", "r")
unique_lines = {}
for line in f:
    current = line.strip()
    unique_lines[current] = 1
f.close()
print('unique lines:')
for line in unique_lines.keys():
    print(line)

### Fixing inconsistent codes

Another data cleaning task is replacing inconsistent spelling with one recognizable value.
"Not applicable" may be written as "NA" or "N/A". 
"Drive" may be written as "Dr." or "Dr".
A dictionary can have the incorrect spelling as the key and the correct spelling as the value.

Let's say the file `misspellings.txt` is this:
```
Smith,John,N/A,230 Overland Dr.
Jones,Michael,NA,34 Blue Ridge Drive
Lund,Mary,Not applicable,Main St
```

In [None]:
misspellings = {"N/A": "NA", "Not applicable": "NA", "Drive": "Dr", "Dr.": "Dr"}
f = open("misspellings.txt", "r")
for line in f:
    current = line.strip()
    for term in misspellings.keys():
        current = current.replace(term, misspellings[term])
    print(current)
f.close()
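
One thing to watch for is that `replace` changes the text wherever it appears, even inside a longer word, so a name like "Driver" would become "Drr". The sketch below is one possible way around this, under the assumption that "NA" codes fill a whole field and street words are separated by spaces. The names `field_fixes`, `word_fixes`, and `clean_line` are made up for this example.

```python
# Hypothetical lookup tables, split by what they have to match.
field_fixes = {"N/A": "NA", "Not applicable": "NA"}   # must match a whole field
word_fixes = {"Drive": "Dr", "Dr.": "Dr"}             # must match a whole word

def clean_line(line):
    cleaned_fields = []
    for field in line.strip().split(","):
        # replace the field only if all of it is a known misspelling
        field = field_fixes.get(field, field)
        cleaned_words = []
        for word in field.split(" "):
            # replace the word only if all of it is a known misspelling
            cleaned_words.append(word_fixes.get(word, word))
        cleaned_fields.append(" ".join(cleaned_words))
    return ",".join(cleaned_fields)

print(clean_line("Smith,John,N/A,230 Overland Dr."))
print(clean_line("Baker,Ann,NA,12 Driver Lane"))
```

The first line is cleaned as before, and the second line is left unchanged because "Driver" is not a whole-word match for "Drive".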

# Glossary

<a name="glossary_algorithm"></a>*Algorithms* are general designs for common program tasks, such as organizing, searching for, or transforming data.

<a name="glossary_data"></a>*Data* are terms, facts, measurements, or statistics that when processed become *information*.

<a name="glossary_cpu"></a>*CPU* is the computer *central processing unit* that runs low-level machine instructions.

<a name="glossary_database"></a>*Databases* hold data much like text files,
except that databases can hold types of data other than text, such as images. 
Databases also usually can read records based on the data in them,
such as names, dates, or any other part of the record. 
These are called *keys*.

<a name="glossary_information"></a>*Information* is data that is preprocessed, organized, analyzed, and interpreted.

<a name="glossary_web_lowercase"></a>*Lowercase* letters are characters from `a` to `z`. 

<a name="glossary_newline"></a>*Newline* is a character that is not visible, but causes the printing to start on a new line.

<a name="glossary_text"></a>*Text* is letters, numbers, and punctuation much as you would type on a keyboard. Text may include European letters like &ouml;, &eacute;, &icirc; and &ntilde;.  

<a name="glossary_web_uppercase"></a>*Uppercase* letters are characters from `A` to `Z`. 

<a name="glossary_web_browser"></a>A *web browser* is a program to navigate the Internet, such as Chrome, Edge, Firefox, or Safari.