# Strings

A string can be thought of as a sequence of characters. We've already seen `len()` used, but let's see it again.

In [1]:
len("Hello there")

11

In [2]:
len("")

0

In [3]:
len("""This is a multi-line
string. Pretty cool!""")

41

You can reference elements in a sequence by position. You use square brackets and the _index_ of the element in the sequence. Indexes start at 0. Here are some examples:

In [5]:
greeting = "Hello there!"

In [6]:
greeting[0]

'H'

In [7]:
greeting[1]

'e'

In [8]:
# Negative indexes count from the right-hand side.
greeting[-1]

'!'

In [9]:
greeting[-3]

'r'
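
A negative index can be read as shorthand for counting back from `len()`. The sketch below checks that equivalence and shows what happens when an index falls outside the string. The variable names `last`, `same_as_last`, and `message` are only for illustration.

```python
greeting = "Hello there!"

# A negative index -n refers to the same element as len(greeting) - n.
last = greeting[-1]
same_as_last = greeting[len(greeting) - 1]
print(last == same_as_last)

# Indexing past either end raises an IndexError instead of returning a value.
try:
    greeting[12]
except IndexError as error:
    message = str(error)
    print("IndexError:", message)
```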

In [10]:
idx = 0

while idx < len(greeting):
    print(idx, greeting[idx])
    idx += 1

0 H
1 e
2 l
3 l
4 o
5  
6 t
7 h
8 e
9 r
10 e
11 !


In [11]:
for character in greeting:
    print(character)

H
e
l
l
o
 
t
h
e
r
e
!


In [12]:
# range(5) yields 0, 1, 2, 3, 4 (more on ranges below).
for n in range(5):
    print(n)

0
1
2
3
4


You can also reference subsequences of a sequence, called _slices_. You use square brackets like before, but you put the starting index, a colon, and the ending index. The ending index is non-inclusive: the element at the ending index isn't in the subsequence.

In [13]:
greeting[0:5]

'Hello'

In [14]:
greeting[6:11]

'there'

In [15]:
greeting[4:7]

'o t'

In [16]:
# Negative numbers work, too!
greeting[6:-1]

'there'

In [17]:
greeting[-3:-1]

're'
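
Because the ending index is excluded, a few handy properties follow. This is a small sketch of them, with `start`, `stop`, `piece`, and `beyond` as illustrative names:

```python
greeting = "Hello there!"

start, stop = 6, 11
piece = greeting[start:stop]

# The slice has stop - start elements because the end index is excluded.
print(len(piece) == stop - start)

# Two slices that share a boundary split the string with no overlap or gap.
print(greeting[0:5] + greeting[5:12] == greeting)

# Unlike a single index, a slice that runs past the end is cut short
# instead of raising an error.
beyond = greeting[6:100]
print(beyond)
```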

You can leave off one of the numbers if you want to start at the beginning or go to the end of the sequence.

In [18]:
greeting[:5]

'Hello'

In [19]:
greeting[6:]

'there!'

In [20]:
# What happens if you leave both off?
greeting[:]

'Hello there!'

In [21]:
# A third number is the step: here, every second character.
greeting[0:12:2]

'Hlotee'

In [22]:
# A negative step walks backwards, so this reverses the string.
greeting[::-1]

'!ereht olleH'
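
The last two cells use a third number in the brackets, the _step_. A minimal sketch of how it behaves, with variable names chosen just for this example:

```python
greeting = "Hello there!"

# The full slice form is [start:stop:step]; step defaults to 1.
every_other = greeting[0:12:2]   # indexes 0, 2, 4, 6, 8, 10
odd_positions = greeting[1::2]   # indexes 1, 3, 5, 7, 9, 11

# A negative step walks backwards; with start and stop omitted,
# it reverses the whole sequence.
backwards = greeting[::-1]

print(every_other)
print(odd_positions)
print(backwards)
```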

# Lists

Strings are neat, but what if we want a sequence of other stuff, like a list of students in a class?

In [23]:
students = ['Dakota', 'Allison', 'Taylor', 'Remy', 'Parker',]

In [24]:
len(students)

5

In [25]:
students[0]

'Dakota'

In [26]:
students[1]

'Allison'

In [27]:
students[1:4]

['Allison', 'Taylor', 'Remy']

This is a list, and you can use it like any other sequence. All sequences have [common operations](https://docs.python.org/3/library/stdtypes.html#common-sequence-operations) you can use with them.

In [28]:
# Test for inclusion.
"Taylor" in students

True

In [29]:
"Carter" in students

False

In [30]:
# Concatenation
students + ["Carter", "Peyton"]

['Dakota', 'Allison', 'Taylor', 'Remy', 'Parker', 'Carter', 'Peyton']

In [31]:
# Strings compare alphabetically (strictly, by character code), so this
# gives the first and last names in sorted order.
print(min(students))
print(max(students))

Allison
Taylor


In [32]:
min([9, -2, 19, -6, 4])

-6

In [33]:
sum([9, -2, 19, -6, 4])

24

In [34]:
students.index("Remy")

3

In [35]:
students.index("Emory")

ValueError: 'Emory' is not in list

In [36]:
students.count("Remy")

1

`count` wasn't that useful here, but I bet it would be with a string.

In [37]:
sentence = """Tentative and then with more determination, 
    you started your own investigation."""

sentence.count("e")

8

In [38]:
min([12, "hello", False])

TypeError: '<' not supported between instances of 'str' and 'int'
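
The common operations are not specific to lists. As a quick sketch, here are a few of them applied to a string instead:

```python
greeting = "Hello there!"

# Membership: for strings, `in` also matches whole substrings.
has_there = "there" in greeting

# Concatenation builds a new string and leaves the original alone.
longer = greeting + " Hi!"

# index() gives the position of the first match.
first_t = greeting.index("t")

# min() and max() compare characters by their character codes,
# so a space sorts before any letter.
smallest, largest = min(greeting), max(greeting)

print(has_there, longer, first_t, repr(smallest), repr(largest))
```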

Lists have [a lot more things](https://docs.python.org/3/library/stdtypes.html#mutable-sequence-types) they can do.
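
Most of those extra operations change the list in place. Here is a short sketch of a few of them; the added names are the ones already used in the examples above.

```python
students = ['Dakota', 'Allison', 'Taylor', 'Remy', 'Parker']

# Lists are mutable: these operations change the list in place.
students.append('Carter')       # add to the end
students.insert(0, 'Peyton')    # insert before index 0
students.remove('Taylor')       # delete the first matching element
students[1] = 'Emory'           # replace an element by index

before_sort = list(students)    # keep a copy to compare against
print(before_sort)

# sort() also works in place and returns None.
students.sort()
print(students)
```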

# An aside about objects

Until this section, everything we called was a function: it took arguments and returned a value. Now we have a new syntax, as in `students.index("Remy")` and `sentence.count("e")`.

In Python, everything is an _object_, which means it is not only a value but also has defined behavior. That behavior is contained in _methods_, which are like functions, but are called on specific objects. We will see this a lot more and learn much more about it later. For now, just memorize whether something is a function or a method.

If you wonder why you wouldn't do everything the same way and have `sentence.len()` instead of `len(sentence)`, or maybe `count(sentence, "e")` instead of `sentence.count("e")`, I'm with you.

In [39]:
# :(
sentence.len()

AttributeError: 'str' object has no attribute 'len'
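
To make the distinction concrete, here is a small sketch that puts a function call and two method calls side by side. It also peeks at `__len__`, the special method that `len()` relies on, which is normally not called directly.

```python
sentence = "you started your own investigation."

# A function is called by itself and receives the object as an argument.
length = len(sentence)

# A method is looked up on the object with a dot and then called.
shouting = sentence.upper()
e_count = sentence.count("e")

# Behind the scenes, len() asks the object for its length through a
# special method named __len__, which is why there is no plain .len().
print(length == sentence.__len__())
print(shouting)
print(e_count)
```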

# For loops

One thing you will need to do in programming very often is to iterate over the members of a sequence and do something with them.

In [40]:
for student in students:
    print(student + " is a great student.")

Dakota is a great student.
Allison is a great student.
Taylor is a great student.
Remy is a great student.
Parker is a great student.


In [41]:
for number in [1, 2, 3, 4, 5]:
    print(number ** 2)

1
4
9
16
25


How can you use a for loop to do stuff besides printing? What if you wanted to make a new sequence?

In [51]:
sentence = "Making plots and visualizations is one of the most important tasks in data analysis."
all_letters = "abcdefghijklmnopqrstuvwxyz"
found_letters = []
for char in sentence.lower():
    if char in all_letters and char not in found_letters:
        found_letters.append(char)
        
print(found_letters)

['m', 'a', 'k', 'i', 'n', 'g', 'p', 'l', 'o', 't', 's', 'd', 'v', 'u', 'z', 'e', 'f', 'h', 'r', 'y']


In [44]:
print(sorted(found_letters))

['a', 'd', 'e', 'f', 'g', 'h', 'i', 'k', 'l', 'm', 'n', 'o', 'p', 'r', 's', 't', 'u', 'v', 'y', 'z']


In [52]:
numbers = [1, 2, 3, 4, 5, 6,]
squares = []

for number in numbers:
    squares.append(number ** 2)
    
print(squares)

[1, 4, 9, 16, 25, 36]
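
Both cells follow the same pattern: start with an empty list, loop, and append. As a sketch, that pattern can be wrapped in a function so it can be reused; `unique_letters` is a made-up name for this example.

```python
def unique_letters(text):
    """Return the distinct letters in text, in order of first appearance."""
    all_letters = "abcdefghijklmnopqrstuvwxyz"
    found = []
    for char in text.lower():
        # Skip spaces and punctuation, and skip letters already collected.
        if char in all_letters and char not in found:
            found.append(char)
    return found

print(unique_letters("Hello there!"))
```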


# Tuples

Tuples are a lot like lists, but they are immutable: they cannot be changed after they are created. There are lots of good reasons for that, but one you'll see immediately is when you want to have a _record_, that is, a fixed group of values that has the same shape for every item in a set. Take coordinates, for instance:

In [53]:
def sqrt(number):
    return number ** 0.5

def distance(pos1, pos2):
    """Calculates the length of a straight line drawn from one coordinate to another."""
    
    adjacent = pos1[0] - pos2[0]
    opposite = pos1[1] - pos2[1]
    hypotenuse = sqrt(adjacent ** 2 + opposite ** 2)
    return hypotenuse

distance((4, 5), (1, 9))


5.0

In [55]:
x, y = (4, 5)
print(x, y)

4 5
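
Here is a brief sketch of what immutability means in practice, and of tuples used as records. The landmark names and coordinates are made up for illustration.

```python
position = (4, 5)

# Tuples support the read-only sequence operations...
print(position[0], len(position))

# ...but not assignment: changing an element raises a TypeError.
try:
    position[0] = 10
except TypeError as error:
    print("TypeError:", error)

# A list of tuples works well as a set of records with the same shape.
landmarks = [("library", 4, 5), ("cafe", 1, 9)]
labels = []
for name, x, y in landmarks:
    # Each record unpacks into three variables, just like x, y = (4, 5).
    labels.append(name + " is at " + str((x, y)))
print(labels)
```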


Parentheses are used for multiple things in Python, so if you are using a tuple with one element, remember to put in a comma.

In [56]:
(1 + 2) * (3 + 4)

21

In [57]:
(1)

1

In [58]:
(1,)

(1,)
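
Checking the types makes the difference visible. In this sketch the variable names are just for illustration:

```python
not_a_tuple = (1)
one_tuple = (1,)

# Parentheses alone only group an expression; the comma makes the tuple.
print(type(not_a_tuple))
print(type(one_tuple))

# The parentheses are optional in most places, so this is also a tuple.
also_a_tuple = 1,
print(also_a_tuple == one_tuple)
```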

# Ranges

Ranges are yet another sequence type. They're great any time you need a series of numbers.

In [59]:
range(5)

range(0, 5)

In [60]:
list(range(5))

[0, 1, 2, 3, 4]

In [61]:
list(range(10, 15))

[10, 11, 12, 13, 14]

In [62]:
list(range(1, 20, 2))

[1, 3, 5, 7, 9, 11, 13, 15, 17, 19]

In [63]:
# What's the sum of all odd numbers from 1 to 1000?
total = 0
for num in range(1, 1000, 2):
    total += num

total

250000

In [64]:
sum(range(1, 1000, 2))

250000
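
Since a range is a sequence, the operations from the earlier sections work on it directly, without converting it to a list first. A short sketch, using `odds` and `first_three` as illustrative names:

```python
odds = range(1, 1000, 2)

# A range supports the common sequence operations without building a list.
print(len(odds))
print(odds[0], odds[-1])
print(999 in odds, 1000 in odds)

# Slicing a range gives back another range.
first_three = odds[:3]
print(list(first_three))

# The sum of the first n odd numbers is n squared, which matches the total above.
print(sum(odds) == len(odds) ** 2)
```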