# Variables

Variables are names that refer to values. They have a couple of obvious and not-so-obvious advantages:
- Store results of operations

In [3]:
city_label = 'SALT LAKE CITY'.title() + ', Utah'
print(city_label)

Salt Lake City, Utah


- Easily reuse values or objects later on in the program
- Can help create self-documenting code

In [None]:
salt_lake_median_income = get_median_income('Salt Lake', census_data)
state_median_income = get_median_income('Utah', census_data)
if salt_lake_median_income > state_median_income:
    print('Salt Lake has a greater median income than the state as a whole')

Variables are created with the single equal sign `=`:

`var_name = <raw value or method that returns a value>`

Variables can hold values of any "type". Types are the fundamental data classifications of a programming language.

In [None]:
# Text ('str')
name = 'Bryce Canyon'

# Whole numbers ('int')
elevation = 7888

# Decimal numbers ('float')
latitude = 37.640
longitude = -112.170

# True or false ('bool')
cool_hoodoos = True

# Editable list of things ('list')
trails = ['Bristlecone Loop', 'Navajo Loop', 'Peakaboo Loop']

# Look-up tables ('dict')
trail_lengths = {'Bristlecone Loop': 1.0, 'Navajo Loop': 1.3, 'Peakaboo Loop': 5.5}


Note that, unlike other languages you may have seen, you don't have to specifically declare the type. Python infers the type based on the value you assign to the variable.
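To see the inferred types for yourself, the built-in `type()` function reports the type of whatever a variable currently refers to. The snippet below is a small sketch that reuses a few of the values from the cell above; the final reassignment is only there to illustrate that a name can later point at a value of a different type.

```python
#: type() reports the type Python inferred from each assigned value
name = 'Bryce Canyon'
elevation = 7888
latitude = 37.640
cool_hoodoos = True
trails = ['Bristlecone Loop', 'Navajo Loop']

for value in (name, elevation, latitude, cool_hoodoos, trails):
    print(type(value).__name__)

#: Assigning a value of a different type changes what type() reports
elevation = '7,888 ft'
print(type(elevation).__name__)
```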

Variables can be reused or overwritten, including in the same line (or statement):

In [4]:
age = 15
print(f'Original age: {age}')
age += 1
print(f'After birthday: {age}')
age = 46
print(f'After many years: {age}')

Original age: 15
After birthday: 16
After many years: 46


# Variable Names

Variable names can contain letters, numbers, and underscores (_), but cannot start with a number.

In [5]:
# Allowed
age = 15
district_5_population = 13541

# Borked
5_city_population = 65309

SyntaxError: invalid token (<ipython-input-5-260386107daa>, line 6)
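If you want to check a candidate name without triggering a `SyntaxError`, one option is the string method `.isidentifier()` together with the standard library's `keyword` module. This is only a sketch: the `candidates` list and the `usable` dictionary are names made up for this example. Note that reserved words such as `class` follow the character rules but still can't be used as variable names.

```python
import keyword

candidates = ['age', 'district_5_population', '5_city_population', 'class']

usable = {}
for candidate in candidates:
    #: isidentifier() checks the letters/numbers/underscores rule
    follows_rules = candidate.isidentifier()
    #: Reserved words pass isidentifier() but are not allowed as names
    is_reserved = keyword.iskeyword(candidate)
    usable[candidate] = follows_rules and not is_reserved

for candidate, ok in usable.items():
    print(f'{candidate}: {ok}')
```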

With only a few exceptions, variable names should be descriptive and unambiguous. You really want to avoid one-letter variable names. 

Which of these two examples is more readable and concise?

In [6]:
#: Population
x = 15434
#: Area
y = 5.67
#: Population density
z = x/y

In [None]:
population = 15434
area = 5.67
population_density = population/area

This is called self-documenting code. We've been using comments (lines starting with `#`) for a while now to add context to our code, but the best code is written so that it needs as few comments as possible.

Chances are, you will have to look at your script in 6 months (or 2 years) when you've completely forgotten why you did something. Worse, you'll be handed someone else's code and asked to fix or update it. Well-documented code, through both self-documenting variable names and appropriate comments, is a lifesaver.

# A Comment on Comments

Comments can be overused. Consider the following:

In [None]:
#: City population
population = 15434
#: City area
area = 5.67
#: Density is population divided by area
population_density = population/area

While true, there's a lot of information duplication here. Just like good cartography is all about knowing what to take off the map, good comments are about only including what is absolutely necessary to understand your program.

You should use comments to explain **why** you did something. The **how** should be evident from the code (including using good variable names).

Using a comment to summarize a whole block of code is acceptable, but it is also a sign that the block could probably be bundled up into a function (more on those later).

In [None]:
#: Calculate population density of only the buildable area
population = 15434
area = 5.67
parks_area = .68
water_area = 1.2
steep_slopes = 2.5
usable_area = area - parks_area - water_area - steep_slopes
population_density = population/usable_area
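As a rough sketch of what "bundling it up into a function" might look like, the version below moves the calculation into a function whose name does the job of the summary comment. The function name `buildable_population_density` and its parameters are invented for this illustration; functions themselves are covered later.

```python
def buildable_population_density(population, area, unbuildable_areas):
    """Return population divided by the area left after removing unbuildable land."""
    usable_area = area - sum(unbuildable_areas)
    return population / usable_area


parks_area = .68
water_area = 1.2
steep_slopes = 2.5
population_density = buildable_population_density(
    15434, 5.67, [parks_area, water_area, steep_slopes]
)
print(population_density)
```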

# Variable Types

## int

Integers ("int") hold whole numbers. As we've seen, you can perform all the standard mathematical operations on them.

## float

Floats are used to represent decimal numbers. They work just fine 99% of the time, [but when they don't](https://www.youtube.com/watch?v=PZRI1IfStY0), they don't. If you ever run into problems, check out the `Decimal` type in the `decimal` module. Otherwise, the built-in `round()` function can get you close.

In [12]:
.1 + .2


0.30000000000000004

In [13]:
round(.1 + .2, 2)

0.3
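For comparison, here is a minimal sketch of the same sum using `Decimal` from the standard library's `decimal` module. The values are built from strings on purpose: a `Decimal` created from the float `.1` would inherit the float's tiny error.

```python
from decimal import Decimal

#: Build Decimals from strings so the values are exact
total = Decimal('0.1') + Decimal('0.2')
print(total)                     # 0.3
print(total == Decimal('0.3'))   # True

#: The same comparison with floats fails
print(.1 + .2 == .3)             # False
```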

You may be working with mixed ints and floats. Most of the time, Python will automatically convert the int to a float in the background.
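A short sketch of that behavior is below. The variable names and the feet-to-meters conversion are just for illustration.

```python
elevation_ft = 7888        # int
meters_per_foot = 0.3048   # float

#: int * float gives a float
elevation_m = elevation_ft * meters_per_foot
print(type(elevation_m).__name__)        # float

#: True division (/) always returns a float, even for two ints
half = 10 / 2
print(half, type(half).__name__)         # 5.0 float

#: Floor division (//) of two ints stays an int
third = 10 // 3
print(third, type(third).__name__)       # 3 int
```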

## str

Strings ("str") store text data in a program, and we use them all over the place. You've probably already done some string manipulation in the Field Calculator or labeling tools in ArcGIS Pro. We use strings to convey info to our program's users or write info to a logfile.

You can use either `'` or `"` to define strings. It's up to you, but try to be consistent. If you need to use the quote character in the string itself, you need to escape it with a backslash (`\`). Alternatively, you can often wrap the string in the other kind of quote mark.

In [2]:
#: Both work
single_quotes = 'Go for a run'
double_quotes = "Biking is better"

#: Use double quotes to include an apostrophe
forecast = "Highs in the low 70's"
#: Or single quotes to include a quote in the string
error = 'Variable "temperature" has not been set'

#: Escaping quotes (also works for other special characters)
report = 'The meteorologist\'s forecast specifically said "The lows won\'t be below 55."'  

## Building Strings

Sometimes you just need to store some pre-defined text, like we've done already. Frequently, though, you'll need to build longer or more complex strings out of other data, including floats and ints.

The `.join()` method can be used to build a string out of a collection of data using a predefined joining string:

In [5]:
activities = ['Hiking', 'Biking', 'Stargazing', 'Loafing']
activity_list = ', '.join(activities)
print(activity_list)

Hiking, Biking, Stargazing, Loafing
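One thing to watch for: `.join()` only accepts strings, so a collection of ints or floats has to be converted first. The sketch below uses placeholder elevation values chosen only for the example.

```python
elevations = [7888, 8296, 9115]  # placeholder numbers for illustration

#: ', '.join(elevations) would raise a TypeError because the items are ints,
#: so convert each item to a string as it is joined
elevation_list = ', '.join(str(elevation) for elevation in elevations)
print(elevation_list)
```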


Another way to build strings is f-strings ("formatted strings"). These are particularly useful for combining lots of data from different variables into a single string.

In [8]:
high_temp = 83
low_temp = 44
precip = 40
pressure = 29.82
pressure_direction = 'climbing'

forecast = f'Today\'s high will be {high_temp} degrees with a {precip}% chance of precipitation. The low tonight will be {low_temp}. The pressure is {pressure} and {pressure_direction}.'
print(forecast)

Today's high will be 83 degrees with a 40% chance of precipitation. The low tonight will be 44. The pressure is 29.82 and climbing.


You can also put more complex expressions in the `{}`s, like method calls or math. Multiplying a string by a number repeats it, which is useful for printing output.

In [10]:
city = 'TROPIC'
banner = f'{"="*10}\n  {city.title()}  \n{"="*10}\n{forecast}'
print(banner)

==========
  Tropic  
==========
Today's high will be 83 degrees with a 40% chance of precipitation. The low tonight will be 44. The pressure is 29.82 and climbing.
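f-strings also accept an optional format specifier after a colon inside the `{}`s, which controls how a number is displayed. The sketch below shows three common ones; the variable names ending in `_text` are made up for this example.

```python
pressure = 29.82
population = 15434
precip_chance = 0.4

pressure_text = f'Pressure: {pressure:.1f}'              # one decimal place
population_text = f'Population: {population:,}'          # thousands separator
precip_text = f'Precipitation: {precip_chance:.0%}'      # fraction shown as a percentage

print(pressure_text)     # Pressure: 29.8
print(population_text)   # Population: 15,434
print(precip_text)       # Precipitation: 40%
```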


All strings have a set of [methods](https://docs.python.org/3/library/stdtypes.html#string-methods) that can be called on them to create a new, modified string:

- `.title()`, `.upper()`, `.lower()`/`.casefold()`: Proper Case, UPPER CASE, or lower case. Useful for comparing strings.
- `.endswith()`, `.startswith()`, `.find()`: check for specific text in the string
- `.split()`, `.partition()`, `.strip()`: divide the string or remove leading and trailing characters (whitespace by default)

In [11]:
print('11-029-0056'.split('-'))

['11', '029', '0056']
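Here is a brief sketch of a few of the other methods listed above, applied to a parcel ID that has some stray whitespace around it. The variable names are only for illustration.

```python
parcel_id = '  11-029-0056 \n'

#: strip() removes leading and trailing whitespace
cleaned = parcel_id.strip()
print(cleaned)

print(cleaned.startswith('11'))   # True
print(cleaned.endswith('0056'))   # True
print(cleaned.find('029'))        # 3, the index where '029' begins (-1 if absent)

#: partition() splits on the first separator only and keeps the separator
pieces = cleaned.partition('-')
print(pieces)                     # ('11', '-', '029-0056')

#: casefold() both sides for a case-insensitive comparison
same_city = 'Salt Lake'.casefold() == 'SALT LAKE'.casefold()
print(same_city)                  # True
```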


Strings in Python are immutable, which means that once we create a string we can't change it directly. Every string operation creates a new string with the desired changes.

In [1]:
city = 'BRYCE CANYON CITY'
#: Does not modify city, but returns a new value (which we don't assign to a variable)
city.title()
print(city)

BRYCE CANYON CITY


In [2]:
#: Assign the output to a variable
title_case_city = city.title()
print(city)
print(title_case_city)

BRYCE CANYON CITY
Bryce Canyon City


In [3]:
#: overwrite the existing variable
city = city.title()
print(city)

Bryce Canyon City


# Collections

Simple variables store individual values, but we often need to store groups of values. Python gives us several built-in tools to do this.

## Lists
Lists are a simple and flexible collection of data. You can put different types of data into the same list.

In [15]:
parcel_ids = ['11-011-0033', 21094510720000, '00-0004-6263']

Items in lists can be referred to by their index, which starts at 0. You can also use negative indexes, which work back from the end.

In [16]:
print(parcel_ids[0])
print(parcel_ids[2])
print(parcel_ids[-1])

11-011-0033
00-0004-6263
00-0004-6263


You can also use slice notation to refer to just portions of a list. You create a slice by specifying the first index you want to include and the index after the last one, separated by a colon: `list_name[start_index:end_index+1]`. From math class, it's a half-open interval: `[start, end)`. So, `list_name[1:5]` means "give me all the items beginning at index 1 up to, but not including, index 5."

You can also omit the start or end value and Python will interpret it as the beginning or end of the list, respectively: `list_name[:4]` or `list_name[4:]`. These become "everything up to, but not including, index 4" and "everything from index 4 to the end."



In [2]:
letters = ['a', 'b', 'c', 'd', 'e', 'f']
letters[2:5]

['c', 'd', 'e']

In [3]:
letters[:5]

['a', 'b', 'c', 'd', 'e']

In [4]:
letters[2:]

['c', 'd', 'e', 'f']

In [5]:
letters[:]

['a', 'b', 'c', 'd', 'e', 'f']

Because the slice interval is open at the end, and the index starts at 0, we get two cool side effects:

- `list_name[:n]` gives us the first n items in the list. For example, if n were 5 we'd get the first 5 items.
- `list_name[n:]` gives us the rest of the items in the list.

So, if we want to split a list at a given index, we just have to take two slices of the original list:

In [6]:
split_point = 3
letters[:split_point]


['a', 'b', 'c']

In [7]:
letters[split_point:]

['d', 'e', 'f']

Adding data to a list is easy: `.append()`

In [8]:
letters.append('g')
letters

['a', 'b', 'c', 'd', 'e', 'f', 'g']

However, many functions return a new list of their own. How do we add that to our existing list?

In [9]:
first_list = ['spam', 'eggs']
new_list = ['African', 'European']
#: Just append, right?
first_list.append(new_list)
first_list

['spam', 'eggs', ['African', 'European']]

Now we've got a nested list (which can be useful, but not what we're after...)

`.extend()` is our friend here.

In [10]:
first_list = ['spam', 'eggs']
new_list = ['African', 'European']
#: extend adds each item of new_list individually
first_list.extend(new_list)
first_list

['spam', 'eggs', 'African', 'European']

We can alter and rearrange items in a list using `.sort()`, `.reverse()`, etc.

In [12]:
letters.reverse()
letters

['g', 'f', 'e', 'd', 'c', 'b', 'a']

In [14]:
letters.sort()
letters

['a', 'b', 'c', 'd', 'e', 'f', 'g']

In [15]:
#: what about mixed lists?
many_things = ['Clarkson', 'Hammond', 23, 0.25]
many_things.sort()

TypeError: '<' not supported between instances of 'int' and 'str'

In [16]:
nested_list = ['spam', 'eggs', ['African', 'European']]
nested_list.sort()

TypeError: '<' not supported between instances of 'list' and 'str'
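If you really do need to sort a mixed list, one possible workaround (a sketch, not the only approach) is to pass a `key` function to `.sort()` so that every item is compared in the same form. Here `key=str` compares the items by their text representation; the items in the list are left unchanged.

```python
many_things = ['Clarkson', 'Hammond', 23, 0.25]

#: key=str sorts by each item's text form ('0.25', '23', 'Clarkson', 'Hammond')
many_things.sort(key=str)
print(many_things)
```

Keep in mind that this is a text ordering, not a numeric one, so `100` would sort before `23`.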

## Dictionaries

Dictionaries are useful for storing a collection of things that can be identified by a unique ID. Each `key` maps to one and only one `value`, kind of like how every feature in a feature class can be identified by its object ID. These are sometimes referred to as look-up tables (or hash tables) because you use one value to get info about another value or set of values.

In [17]:
trail_lengths = {'Bristlecone Loop': 1.0, 'Navajo Loop': 1.3, 'Peakaboo Loop': 5.5}
print(trail_lengths['Bristlecone Loop'])
print(trail_lengths['Navajo Loop'])

1.0
1.3


A dictionary's `keys` must be "hashable" - able to be represented by a programmatically calculated value (hash) that never changes over time (immutable). The basic variable types we've looked at already - ints, floats, strings - are all immutable and can be used as keys. 

Because we can change the content of lists, they are mutable and thus can't be used as keys. (You also can't use a dictionary as a key of another dictionary.)

In [4]:
#: Ints
stations = {1: 'Salt Lake', 2: 'Provo', 3: 'Nephi'}
#: Strings
trail_lengths = {'Bristlecone Loop': 1.0, 'Navajo Loop': 1.3, 'Peakaboo Loop': 5.5}

In [6]:
#: Lists: won't work
assigned_counties = {['Jake', 'Josh']: 'Salt Lake', ['Jordan', 'Jason']: 'Davis'}

TypeError: unhashable type: 'list'
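If you do need a group of values as a key, a tuple (covered in the Tuples section of this document) is immutable and therefore hashable, as long as everything inside it is hashable too. A minimal sketch of the failing example rewritten with tuple keys:

```python
#: Tuples of strings are hashable, so they can be dictionary keys
assigned_counties = {('Jake', 'Josh'): 'Salt Lake', ('Jordan', 'Jason'): 'Davis'}
print(assigned_counties[('Jake', 'Josh')])

#: hash() succeeds on a tuple of strings but raises TypeError on a list
tuple_hash = hash(('Jake', 'Josh'))
try:
    hash(['Jake', 'Josh'])
    list_is_hashable = True
except TypeError:
    list_is_hashable = False
print(list_is_hashable)
```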

A dictionary's `values` can be whatever you want: ints, strings, lists, other objects, just about anything.

In [7]:
county_assignments = {'Salt Lake': ['Jake', 'Josh'], 'Davis': ['Jason', 'Jordan']}
#: Access by key and append
county_assignments['Salt Lake'].append('Steve')
print(county_assignments)

{'Salt Lake': ['Jake', 'Josh', 'Steve'], 'Davis': ['Jason', 'Jordan']}


Nested dictionaries can be used to replicate a table-like structure where the key of the outer dictionary is the row id and the keys of the inner dictionaries are the column names.

In [13]:
counties = {1: {'Name': 'Beaver', 'FIPS': 49001}, 2: {'Name': 'Box Elder', 'FIPS': 49003}}
#: Create a new entry by referencing an unused key
counties[3] = {'Name': 'Cache', 'FIPS': 49005}
print(counties)
#: Accessing nested dicts
print(counties[1]['Name'])

{1: {'Name': 'Beaver', 'FIPS': 49001}, 2: {'Name': 'Box Elder', 'FIPS': 49003}, 3: {'Name': 'Cache', 'FIPS': 49005}}
Beaver


You can use the `.items()` method to iterate over a dictionary (more on iteration later).

In [19]:
for row_number, county in counties.items():
    print(f'=={row_number}==')
    for field, value in county.items():
        print(f'\t{field}: {value}')

==1==
	Name: Beaver
	FIPS: 49001
==2==
	Name: Box Elder
	FIPS: 49003
==3==
	Name: Cache
	FIPS: 49005


Prior to Python 3.7, dictionaries were not guaranteed to preserve insertion order; a dictionary was simply an unordered collection of items. From Python 3.7 on, insertion order is preserved. ArcGIS Pro 2.7+ includes Python 3.7.

However, in the interest of self-documenting code and the Zen of Python's "explicit is better than implicit", if you need to rely on dictionary ordering, you should use the `OrderedDict` from the `collections` module.

In [12]:
#: First, import the OrderedDict class
from collections import OrderedDict

#: Create an OrderedDict object
d = OrderedDict()

#: Reference keys like normal
d[1] = {'Name': 'Beaver', 'FIPS': 49001}
d[2] = {'Name': 'Box Elder', 'FIPS': 49003}

print(d)
print(d[2])

OrderedDict([(1, {'Name': 'Beaver', 'FIPS': 49001}), (2, {'Name': 'Box Elder', 'FIPS': 49003})])
{'Name': 'Box Elder', 'FIPS': 49003}
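One practical difference that makes `OrderedDict` the more explicit choice: when two `OrderedDict` objects are compared, the order of their items matters, while plain dictionaries with the same items are equal regardless of order. The sketch below assumes Python 3.7 or later; the variable names are invented for the example.

```python
from collections import OrderedDict

plain_a = {'Name': 'Beaver', 'FIPS': 49001}
plain_b = {'FIPS': 49001, 'Name': 'Beaver'}

#: Plain dicts keep insertion order but ignore it when compared
print(list(plain_a))        # ['Name', 'FIPS']
print(plain_a == plain_b)   # True

#: OrderedDicts treat order as part of their identity
ordered_a = OrderedDict(plain_a)
ordered_b = OrderedDict(plain_b)
print(ordered_a == ordered_b)   # False
```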


In [14]:
counties.items()

dict_items([(1, {'Name': 'Beaver', 'FIPS': 49001}), (2, {'Name': 'Box Elder', 'FIPS': 49003}), (3, {'Name': 'Cache', 'FIPS': 49005})])

Python uses dictionaries for a lot of the "under the hood" stuff to keep track of all the different objects (variables) that have been defined. You can call the `locals()` or `globals()` functions to get dictionaries of all the variables.

In [24]:
#: Doing this in an IPython notebook spits out all the variables associated with the notebook itself.
locals()

{'__name__': '__main__',
 '__doc__': 'Automatically created module for IPython interactive environment',
 '__package__': None,
 '__loader__': None,
 '__spec__': None,
 '__builtin__': <module 'builtins' (built-in)>,
 '__builtins__': <module 'builtins' (built-in)>,
 '_ih': ['',
  "city = 'BRYCE CANYON CITY'\ncity.title()\nprint(city)",
  '#: Assign the output to a variable\ntitle_case_city = city.title()\nprint(city)\nprint(title_case_city)',
  '#: overwrite the existing variable\ncity = city.title()\nprint(city)',
  "#: Ints\nstations = {1: 'Salt Lake', 2: 'Provo', 3: 'Nephi'}\n#: Strings\ntrail_lengths = {'Bristlecone Loop': 1.0, 'Navajo Loop': 1.3, 'Peakboo Loop': 5.5}",
  "#: Lists: won't work\nassigned_counties = {['Jake', 'Josh'] = 'Salt Lake', ['Jordan', 'Jason'] = 'Davis'}",
  "#: Lists: won't work\nassigned_counties = {['Jake', 'Josh']: 'Salt Lake', ['Jordan', 'Jason']: 'Davis'}",
  "county_assignments = {'Salt Lake': ['Jake', 'Josh'], 'Davis': ['Jason', 'Jordan']}\ncounty_assig

## Tuples

Tuples are an unchangeable group of objects; once you create a tuple, you can't change its contents. They are commonly used when you need to quickly package a bunch of data together to move to different parts of your program. Any function/method that returns more than one object implicitly creates a tuple and returns the tuple.

In [1]:
#: Defined by ()s
trails_to_hike = ('Bristlecone', 'Navajo', 'Hat Shop', 'Fairyland')
#: Access through indices like lists
print(trails_to_hike[1])
print(trails_to_hike[2:4])

Navajo
('Hat Shop', 'Fairyland')


In [2]:
#: can't modify
trails_to_hike[1] = 'Swamp Canyon'

TypeError: 'tuple' object does not support item assignment

Tuples can be "unpacked" into multiple variables in a single line of code. This is often used when a function returns multiple objects and you want to assign them meaningful names.

In [3]:
x, y = (45, 2.3)
print(x)
print(y)

45
2.3


In [4]:
def square_and_cube(number):
    square = number ** 2
    cube = number ** 3
    return square, cube

square, cube = square_and_cube(3)
print(square)
print(cube)

9
27


# Iterations

"Insanity is doing the same thing over and over again and expecting different results" - Einstein's pet gerbil. 

A common task is to inspect or operate on every item in a collection. You may want to change the visibility of every layer in a map, or get a value from every feature in a feature class, or check the usage of every AGOL item in your organization. Loops also allow you to do something repeatedly until a condition has been met.

The `for` loop is probably the most-used loop.

In [6]:
#: for <var_name> in <iterable>
for trail in trails_to_hike:
    print(trail)

Bristlecone
Navajo
Hat Shop
Fairyland


Under the hood, an `iterable` is any object that implements the `__iter__()` method, which returns an `iterator` object that handles the logic of returning the next item in sequence. Lists are a common iterable, but strings are iterables too!
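To make that a little more concrete, here is a sketch of roughly what a `for` loop does for you, using the built-in `iter()` and `next()` functions by hand. You would rarely write this yourself; the names `trail_iterator`, `first`, and `second` are just for the example.

```python
trails_to_hike = ('Bristlecone', 'Navajo')

#: iter() asks the iterable for its iterator (it calls __iter__() for us)
trail_iterator = iter(trails_to_hike)

#: next() asks the iterator for the next item in sequence
first = next(trail_iterator)
second = next(trail_iterator)
print(first, second)

#: When nothing is left, the iterator raises StopIteration,
#: which is the signal a for loop uses to stop
try:
    next(trail_iterator)
    exhausted = False
except StopIteration:
    exhausted = True
print(exhausted)

#: Strings are iterable too: each step yields one character
letters = [letter for letter in 'Utah']
print(letters)
```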

Sometimes, we need the index of the current item. This is usually used for accessing previous values in the iterable. We use the `enumerate()` built-in function, which returns a tuple of the count and the value at each step through the iterable.

In [10]:
#: We want to get the distance between each station
stations = [0, 4, 13, 25, 28, 40]
for i, station in enumerate(stations):
    if i == 0:
        continue
    distance = station - stations[i-1]
    print(distance)

4
9
12
3
12


If you're familiar with other programming languages, you might be used to always using the index to iterate. While you can do this in Python using the `range()` and `len()` functions, it's considered unpythonic and should be avoided lest the PEP 8 police come after you.

In [11]:
#: Generally avoided!
stations = [0, 4, 13, 25, 28, 40]
for i in range(0, len(stations)):
    if i == 0:
        continue
    distance = stations[i] - stations[i-1]
    print(distance)

4
9
12
3
12


The `while` loop is useful when you don't know in advance how many times you need to do something. It keeps running for as long as its condition is true.
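As a minimal sketch of a `while` loop, the cell below counts how many years it takes a population to pass a threshold. The 2% growth rate and the 20,000 threshold are assumptions picked purely for illustration, not real projections.

```python
population = 15434
growth_rate = 0.02   # assumed annual growth, for illustration only
threshold = 20000    # assumed target, for illustration only

years = 0
#: Keep looping as long as the condition is true
while population < threshold:
    population = population * (1 + growth_rate)
    years += 1

print(f'Passed {threshold} after {years} years')
```

Make sure something inside the loop eventually makes the condition false; otherwise the loop never ends.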

## Breaking the iteration

The `enumerate()` example above uses the `continue` statement to skip the rest of the step if `i` is `0`. There are three different controls that give you flexibility in the loop.

- `continue`: Skip the rest of the current step and move on to the next.
- `break`: Stop the entire loop and move on to the next lines of code after the loop
- `pass`: Do nothing.

In [14]:
#: Use continue to skip all 'y's
test_string = 'Bryce Canyon'
for letter in test_string:
    if letter == 'y':
        continue
    print(letter)

B
r
c
e
 
C
a
n
o
n


In [15]:
#: Use break to stop once it encounters a space
test_string = 'Bryce Canyon'
for letter in test_string:
    if letter == ' ':
        break
    print(letter)

B
r
y
c
e


In [16]:
#: pass just does nothing and moves on to the next:
test_string = 'Bryce Canyon'
for letter in test_string:
    if letter == 'y':
        pass
    print(letter)

B
r
y
c
e
 
C
a
n
y
o
n


In [17]:
#: You can use pass for the body of a function you know you'll need but haven't written yet
def step_one():
    pass

def step_two():
    print('Hi mom!')
    
step_one()
step_two()

Hi mom!
