# Strings, Arrays, and `for` Loops

In this notebook we'll cover the basics of strings, arrays, and `for` loops in Python.

 1. First we'll introduce strings and string indexing.
 2. Next we'll introduce lists and tuples--built-in datatypes which are a little bit different in Python than in some other languages.
 3. Next we'll introduce NumPy arrays and matrices--datatypes designed with numerical work in mind.
 4. Finally, we'll get a little bit familiar with `for` loops.

## Strings

A string, in Python as in other programming languages, is a sequence of individual characters. It can be declared using `'` single quotes or `"` double quotes.

It can also be declared using triple quotes `'''` `"""` of either variety, which allows for line breaks.

Strings in Python can be concatenated using a simple `+` operator.

In [1]:
a_string = 'Hi there.'
another_string = "How are you?"

print(a_string + ' ' + another_string)

Hi there. How are you?
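As a small illustration of the triple-quote form mentioned above, the sketch below declares a string that spans two lines and compares it with the same text written on one line using an explicit `\n` newline character. The variable names `multi_line` and `single_line` are made up for this example.

```python
# A triple-quoted string may span several lines; the line break is kept.
multi_line = """Hi there.
How are you?"""

print(multi_line)

# The same text on one line, with the line break written as \n
single_line = 'Hi there.\nHow are you?'

# Both spellings produce the same string
print(multi_line == single_line)
```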


### String indexing

Individual elements of strings can be referenced through indexing. 

Indexes in Python start from 0: 0 is the first element. The last element in an $N$-length string has index $N - 1$.

A range of values can be referenced by giving two indexes and placing a `:` colon between them.

The range `a:b` will return all elements from `a` to `b-1`.

Python also allows indexing with negative numbers, which count back from the end of the string. -1 refers to the last element, -2 refers to the second-to-last, etc.

In [2]:
print(a_string[0])
print(a_string[1])
print(a_string[-1])

print(a_string[0:3] + another_string[-4:-1] + '!')


H
i
.
Hi you!
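The indexing rules above can be checked directly. The sketch below redefines `a_string` so that it runs on its own. It also leaves out one endpoint of a slice, which this notebook has not used so far: `[:b]` starts from the beginning and `[a:]` runs to the end.

```python
a_string = 'Hi there.'
n = len(a_string)  # number of characters in the string

# A negative index -k refers to the same element as index n - k
print(a_string[-1] == a_string[n - 1])
print(a_string[-2] == a_string[n - 2])

# A slice a:b stops just before b, so it contains b - a characters
print(len(a_string[2:6]))

# Leaving out an endpoint means "from the start" or "to the end"
print(a_string[:2])
print(a_string[3:])
```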


### Quick exercises for strings

 1. Define some string variables yourself. Use indexing and the `+` operator to make different combinations of them and output to the console with `print()`.

## Lists and tuples

Apart from strings, Python natively has two datatypes that group values together: lists and tuples. The two function almost exactly the same, with one key difference: lists can be modified "on the fly" after definition, while tuples are *immutable*--their individual elements cannot be modified after the fact.

For most applications (and most programmers), the additional restriction placed on tuples is irrelevant and pesky. In this lesson, therefore, we'll only use lists.

Lists are declared using square brackets `[]`. Tuples are declared using round parentheses `()`.


### List indexing

Indexing for lists (and tuples) in Python follows the same rules as indexing for strings. Python lists can also be appended to one another using a `+` operator, just like strings.

(One way to think about Python strings is that they are just special lists that can contain only characters and have some special text-specific functionality.)



In [4]:
a_list = ['a','b','c']

a_tuple = ('a','b','c')

equivalent_string = 'abc'

print(a_list[0],a_tuple[0],equivalent_string[0])

print(a_list + a_list)

a a a
['a', 'b', 'c', 'a', 'b', 'c']


In [5]:
## We can modify individual elements of a list after the fact
a_list[1] = 'B'

print(a_list)

## If we try to do this with a tuple, we get an error instead. 
## Most of the time, this is just an unnecessary hassle! So tuples are used a lot less.
a_tuple[1] = 'B'

print(a_tuple)

['a', 'B', 'c']


TypeError: 'tuple' object does not support item assignment
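If you do need a changed version of a tuple, one option is to build a new tuple from pieces of the old one. The sketch below also catches the `TypeError` so the cell keeps running after the failed assignment. The name `modified_tuple` is made up for this example.

```python
a_tuple = ('a', 'b', 'c')

# Item assignment on a tuple raises TypeError; catch it so the cell continues
try:
    a_tuple[1] = 'B'
except TypeError as err:
    print('Could not modify the tuple:', err)

# Building a new tuple from slices of the old one is allowed.
# Note the trailing comma: ('B',) is a one-element tuple.
modified_tuple = a_tuple[:1] + ('B',) + a_tuple[2:]
print(modified_tuple)

# The original tuple is unchanged
print(a_tuple)
```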



### List contents

A list in Python is extremely flexible. It can hold any sequence of any length of elements of any type.

You can define a list of only numbers, and this will often be useful. But there is, in principle, nothing stopping you from forming a list of one number, two strings, then another number. You can even form a list of lists.


In [6]:
## Lists can contain elements of any type
another_list = [1,2,'abc','e',34]
print(another_list + a_list)

## Lists of lists are also allowed, and often useful
a_list_of_lists = [another_list,a_list,another_list]
print(a_list_of_lists)

[1, 2, 'abc', 'e', 34, 'a', 'B', 'c']
[[1, 2, 'abc', 'e', 34], ['a', 'B', 'c'], [1, 2, 'abc', 'e', 34]]


### Numerical lists

A single list of numbers can function in many ways as a "vector" in math. A list of equal-length lists of numbers can function in many ways as a "matrix" in math.

There are some built-in Python functions that perform convenient operations on a numerical list. For example, the `sum()` function will add up all the elements.

In other ways, a list of numbers in Python is just a list. For example, the `+` operator will not add the elements--it just appends the lists.

Later in this notebook we'll also cover NumPy arrays and matrices, which have additional structure/restrictions and additional functionality to behave like mathematical vectors and matrices.

In [7]:
## This list can be used like a 4-element numerical vector
list_of_floats = [1.,2.,8.,9.]
print(list_of_floats)

## This list of lists is a bit like a 3x4 numerical matrix
pseudo_array_of_floats = [list_of_floats,list_of_floats,list_of_floats]
print(pseudo_array_of_floats)

[1.0, 2.0, 8.0, 9.0]
[[1.0, 2.0, 8.0, 9.0], [1.0, 2.0, 8.0, 9.0], [1.0, 2.0, 8.0, 9.0]]


In [8]:
## sum() function adds up elements in a list, if they are all add-able
print(sum(list_of_floats))

## A list of floats is **still a list**. The `+` operator will not add the elements, it will just append the lists.
print(list_of_floats + list_of_floats)

20.0
[1.0, 2.0, 8.0, 9.0, 1.0, 2.0, 8.0, 9.0]


### Quick exercises for lists and tuples

 1. Define a list of only numbers.
 2. Apply the `sum()`, `max()`, and `min()` functions to your number-only list. Are the results what you expect?
 3. Define a list of only strings.
 4. Apply the `sum()`, `max()`, and `min()` functions to your string-only list. Are the results what you expect?
 5. Define a mixed-type list of some numbers and some strings.
 6. Apply the `sum()`, `max()`, and `min()` functions to your mixed list. Are the results what you expect?

## NumPy Arrays and Matrices

NumPy Arrays and Matrices are designed to function as mathematical vectors and matrices.

They generally must contain elements of only one type.

Arrays and matrices of non-numerical datatypes can be constructed but are not usually useful and are rarely used.

### Declaring NumPy arrays/matrices

You can declare an array or a matrix by feeding a list, or a list of equal-length lists, into the `numpy.array()` or `numpy.matrix()` functions.
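To see the one-type rule in action, the sketch below declares three arrays from lists and prints each array's `dtype` attribute, which records the element type NumPy chose. When the input list mixes types, NumPy converts every element to a single common type. The exact integer type printed can differ between platforms, so no output is shown here.

```python
import numpy as np

# All integers: the array keeps an integer type
int_array = np.array([1, 2, 3])

# Integers mixed with a float: every element becomes a float
mixed_array = np.array([1, 2.5, 3])

# Numbers mixed with a string: every element becomes a string
text_array = np.array([1, 'abc', 3])

print(int_array.dtype)
print(mixed_array.dtype, mixed_array)
print(text_array.dtype, text_array)
```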

### Array vs. matrix

The main difference between an array and a matrix is in how they handle multiplication by default.

If you multiply two arrays with a `*` operator, the default is to perform element-wise multiplication.

If you multiply two matrices with a `*` operator, the default is to perform matrix multiplication.

In [17]:
import numpy as np

an_array = np.array([[1,2,3],[4,5,6],[7,8,9]])

a_matrix = np.matrix([[1,2,3],[4,5,6],[7,8,9]])


print(an_array*an_array)

print(a_matrix*a_matrix)

[[ 1  4  9]
 [16 25 36]
 [49 64 81]]
[[ 30  36  42]
 [ 66  81  96]
 [102 126 150]]
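Matrix multiplication is also available for ordinary arrays through the `@` operator, so you do not need `numpy.matrix()` just to get it. The NumPy documentation itself recommends regular arrays over the matrix class. The sketch below redefines `an_array` so that it runs on its own; `elementwise` and `matrix_product` are names made up for this example.

```python
import numpy as np

an_array = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# `*` on arrays multiplies element by element
elementwise = an_array * an_array

# `@` on arrays performs matrix multiplication
matrix_product = an_array @ an_array

print(elementwise)
print(matrix_product)
```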


### Numerical operations with NumPy arrays vs. Python lists

NumPy arrays are designed to be convenient for many common types of numerical operations.

For example, adding a scalar to a NumPy array with the `+` operator adds that scalar to each element in the array.

If you try to do the same with a list, you get an error. `+` for lists is only for appending.

There is a way to do the element-wise addition operation with lists, but it requires us to first cover the topic of loops.

In [18]:
## Define a list of integers
equivalent_list = [1,2,3,4,5,6,7,8,9]

## Make a NumPy array from the list
a_onedimensional_array = np.array(equivalent_list)

## It's simple to add a scalar to each element of our NumPy array
print(a_onedimensional_array)
print(a_onedimensional_array + 5)

## Not so simple with the list. We get an error!
## There's a way to add a scalar to each list element, but it requires covering loops first
print(equivalent_list + 5)


[1 2 3 4 5 6 7 8 9]
[ 6  7  8  9 10 11 12 13 14]


TypeError: can only concatenate list (not "int") to list

### Quick exercises for NumPy arrays

 1. Define an array with 1 row and 3 columns.
 2. Multiply this array by itself. Is the result what you expect?
 3. Multiply this array by a constant. Is the result what you expect?
 4. Convert your array to a matrix using the `numpy.matrix()` function.
 5. Multiply this matrix by itself. Is the result what you expect?
 6. Define a new matrix which is the transpose of your original matrix, using either the `numpy.transpose()` function or the `.T` attribute.
 7. Multiply your original matrix by its transpose. Is the result what you expect?
 8. Define two new arrays: one with 1 row and 3 columns, and another with 1 row and 4 columns. Multiply them. Is the result what you expect?

### Logical indexing

A convenient type of indexing that can be used with NumPy arrays is "logical indexing." It allows you to select elements of an array according to a vector of Boolean "True/False" values.

For example,

```
a_onedimensional_array[[False,True,True,False,False,True,True,False,True]]
```

...will select the 2nd, 3rd, 6th, 7th, and 9th elements of the array we defined in a previous demonstration codeblock.

The list/array of "True/False" values must be the same shape as the array it is being used to index. It can be arbitrary, like the one we just saw, but it is more often convenient to define it according to a condition which is applied to each element of the same array. For example,


```
a_onedimensional_array[a_onedimensional_array > 3]
```

...will give us all the elements in the array which are greater than 3.


In [19]:
## Logical indexing with an arbitrary list of True/False values
print(a_onedimensional_array[[False,True,True,False,False,True,True,False,True]])
print('')

## Logical indexing with a Boolean array generated from a logical test applied to each element of the same array
print(a_onedimensional_array[a_onedimensional_array > 3])


[2 3 6 7 9]

[4 5 6 7 8 9]
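Two conditions can be combined into one Boolean array with the element-wise operators `&` (and) and `|` (or). Each condition needs its own parentheses, because `&` and `|` bind more tightly than the comparison operators. The sketch below redefines the array so that it runs on its own; `mask`, `both`, and `either` are names made up for this example.

```python
import numpy as np

a_onedimensional_array = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])

# The condition by itself is an array of True/False values
mask = a_onedimensional_array > 3
print(mask)

# Elements greater than 3 AND less than 8
both = a_onedimensional_array[(a_onedimensional_array > 3) & (a_onedimensional_array < 8)]
print(both)

# Elements less than 3 OR greater than 7
either = a_onedimensional_array[(a_onedimensional_array < 3) | (a_onedimensional_array > 7)]
print(either)
```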


### Quick exercises for logical indexing

 1. Define a 1-dimensional array and index it using two different logical conditions.
 2. Define a 2-dimensional array and index it using two different logical conditions.

## `for` Loops

The simplest kind of loop in Python is a `for` loop. It steps through the elements of a list or other "iterable" object one at a time, running the indented block once for each element.

In [20]:
for x in [0,1,2]:
    print(x)

0
1
2


If we want to cycle through a list of consecutive integers, a quick way to get a sequence of all the integers we'll need is to use the built-in `range()` function.

In the cell below, we see three different ways to pull 0,1,2 in sequence: 

  1. by indexing the first 3 elements of a list of integers that we've defined
  2. by calling `range(3)`, which by default starts from 0
  3. by calling `range(0,3)`. Here we've explicitly specified the start point as 0.

In [21]:
list_of_integers = [0,1,2,3,4,5,6,7,8,9]

print('Method 1: first 3 elements of our list')
for x in list_of_integers[0:3]:
    print(x)
print('')

print('Method 2: `range()`, with implicit start point of 0')
for x in range(3):
    print(x)
print('')

print('Method 3: `range()`, with explicit start point of 0')
for x in range(0,3):
    print(x)
print('')



Method 1: first 3 elements of our list
0
1
2

Method 2: `range()`, with implicit start point of 0
0
1
2

Method 3: `range()`, with explicit start point of 0
0
1
2



### Uses of loops

Loops can have many uses. For example, we can output a sequence of powers of 2.

In [22]:
for j in range(16):
    print(2**j)

1
2
4
8
16
32
64
128
256
512
1024
2048
4096
8192
16384
32768
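Another use is the element-wise addition that failed for lists in the NumPy section: a loop can add a scalar to each element of a list. This is a minimal sketch that assumes the list method `.append()`, which adds one element to the end of a list and has not been introduced in this notebook. `shifted_list` is a name made up for this example.

```python
equivalent_list = [1, 2, 3, 4, 5, 6, 7, 8, 9]

# Start with an empty list and fill it one element at a time
shifted_list = []
for value in equivalent_list:
    shifted_list.append(value + 5)

print(shifted_list)
```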


### Vectorization

For many types of loop operations, there is an equivalent operation that could be performed across a pre-defined list or array, which is often called the "vectorized" form of the operation.

This can be important because when many thousands or millions of calculations must be performed, vectorized operations in NumPy and in some other languages like MATLAB are optimized to be much faster than loop operations.

Below is an example of vectorizing the calculation of powers of 2.

In [23]:
array_of_powers = np.array(range(16))

powers_of_two = 2**array_of_powers

print(powers_of_two)

[    1     2     4     8    16    32    64   128   256   512  1024  2048
  4096  8192 16384 32768]
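A useful habit when vectorizing is to check that the loop and the array expression give the same numbers. The sketch below does this for a different sequence, $j^2 + 1$, using `numpy.array_equal()` for the comparison. The variable names are made up for this example, and no timing claim is made here.

```python
import numpy as np

# Loop version: compute j**2 + 1 one element at a time
loop_result = []
for j in range(10):
    loop_result.append(j**2 + 1)

# Vectorized version: the same formula applied to the whole array at once
j_values = np.array(range(10))
vectorized_result = j_values**2 + 1

print(loop_result)
print(vectorized_result)

# Confirm that both approaches produce the same sequence
print(np.array_equal(loop_result, vectorized_result))
```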


### Quick exercises for `for` loops

 1. Write a loop that implements some mathematical sequence.
 2. Write a vectorized version of the same loop that uses NumPy arrays.

## Integer ranges

Sometimes it is convenient to construct an array of numbers in a regular sequence. In Python, one convenient way to do this is with the built-in `range()` function.

The built-in function `range()` will construct a sequence of integers between two integers.
 - If you provide only one number input, `range(a)` will construct a sequence from `0` to `a-1`.
 - If you provide two numbers, `range(a,b)` will be between `a` and `b-1`.
 - If you provide three numbers, `range(a,b,c)` will start at `a` and move in steps of size `c`, stopping before it reaches `b`. The last element is `b-1` only if the steps happen to land on it.
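The three forms can be compared side by side by wrapping each range in the built-in `list()` function, which pulls out all the elements at once. The last line uses a negative step to count downward, a case the list above does not cover.

```python
# One argument: 0 up to a-1
print(list(range(5)))

# Two arguments: a up to b-1
print(list(range(2, 5)))

# Three arguments: start at 2, step by 3, stop before reaching 11
print(list(range(2, 11, 3)))

# A negative step counts downward and stops before reaching 0
print(list(range(5, 0, -1)))
```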

For efficiency, Python will actually not calculate the elements of a range and store them in memory when you declare it. It will only calculate them and pull them out one at a time as you move through the sequence. This is useful in some situations where you might be pulling numbers from a very long sequence (thousands or millions) but don't need to store them all at once.

If you want to make Python calculate all the elements and store them together, one way to do this is to enter it as an argument to the `numpy.array()` function. Then a NumPy array will be created with the elements of the sequence.

In [24]:
import numpy as np

a_range = range(-1,18,2)

print('until we start pulling out elements, a `range` is just a set of parameters')
print(a_range)
print('')

print('we can pull the elements out with a `for` loop')
for j in a_range:
    print(j)
print('')

print('we can also use a range to define a NumPy array')
print(np.array(a_range))
print('')


until we start pulling out elements, a `range` is just a set of parameters
range(-1, 18, 2)

we can pull the elements out with a `for` loop
-1
1
3
5
7
9
11
13
15
17

we can also use a range to define a NumPy array
[-1  1  3  5  7  9 11 13 15 17]



## Linear Spaces

The `range()` built-in function only takes integers as arguments.

If you want a regular sequence where the start point, the stop point, or the step size is not an integer, you could create a NumPy array from an integer sequence and then multiply all the elements by a constant.

A much easier way to achieve the same thing is to use the `numpy.linspace()` function. With this function, you specify the start, the end, and the number of elements.

Note that here we are specifying the number of elements instead of the stepsize directly. Sometimes it is more convenient to specify the stepsize, and sometimes it is more convenient to specify the number of elements, but you can always make a calculation to get one from the other if you need to.
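The conversion between the two is short. Because `numpy.linspace()` includes both endpoints, the step size is `(stop - start) / (num - 1)`. The sketch below computes it by hand, compares it with the step that `numpy.linspace()` reports when called with `retstep=True`, and rebuilds the same sequence by scaling an integer range. The variable names are made up for this example.

```python
import numpy as np

start, stop, num = -1, 17, 10

# Step size from the number of elements (both endpoints are included)
step = (stop - start) / (num - 1)
print(step)

# With retstep=True, linspace returns the samples and the step it used
samples, returned_step = np.linspace(start, stop, num, retstep=True)
print(returned_step)

# Number of elements recovered from the step size
num_from_step = int(round((stop - start) / step)) + 1
print(num_from_step)

# The same sequence built by scaling and shifting an integer range
scaled = start + step * np.array(range(num))
print(np.allclose(scaled, samples))
```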

In [25]:
print('this generates the same sequence we made with the `range` function above')
a_linspace = np.linspace(-1,17,10)
print(a_linspace)
print('')

print('Need 100 numbers between 0 and 1? Here you go!')
another_linspace = np.linspace(0,1,100)
print(another_linspace)
print('')

this generates the same sequence we made with the `range` function above
[-1.  1.  3.  5.  7.  9. 11. 13. 15. 17.]

Need 100 numbers between 0 and 1? Here you go!
[0.         0.01010101 0.02020202 0.03030303 0.04040404 0.05050505
 0.06060606 0.07070707 0.08080808 0.09090909 0.1010101  0.11111111
 0.12121212 0.13131313 0.14141414 0.15151515 0.16161616 0.17171717
 0.18181818 0.19191919 0.2020202  0.21212121 0.22222222 0.23232323
 0.24242424 0.25252525 0.26262626 0.27272727 0.28282828 0.29292929
 0.3030303  0.31313131 0.32323232 0.33333333 0.34343434 0.35353535
 0.36363636 0.37373737 0.38383838 0.39393939 0.4040404  0.41414141
 0.42424242 0.43434343 0.44444444 0.45454545 0.46464646 0.47474747
 0.48484848 0.49494949 0.50505051 0.51515152 0.52525253 0.53535354
 0.54545455 0.55555556 0.56565657 0.57575758 0.58585859 0.5959596
 0.60606061 0.61616162 0.62626263 0.63636364 0.64646465 0.65656566
 0.66666667 0.67676768 0.68686869 0.6969697  0.70707071 0.71717172
 0.72727273 0.73737374 0.74747475 

### Quick exercises for ranges and linear spaces

Implement the following three sequences both using the `range()` function and using the `numpy.linspace()` function:
 1. First element: -8. Last element: 9. Number of elements: 18
 2. First element: -2. Last element: 2. Number of elements: 1000
 3. First element: 8. Last element: 12. Step size: 2