# Strings

A string can be thought of as a sequence of characters. We've already used `len()`, but let's see it again.

In [82]:
len("Hello there")

11

In [83]:
len("")

0

In [84]:
len("""This is a multi-line
string. Pretty cool!""")

41

You can reference individual elements of a sequence by putting the _index_ of the element in square brackets after the sequence. Indexes start at 0. Here are some examples:

In [85]:
greeting = "Hello there!"

In [86]:
greeting[0]

'H'

In [87]:
greeting[1]

'e'

In [88]:
# Counts from the right-hand side.
greeting[-1]

'!'

In [89]:
greeting[-3]

'r'
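
A negative index can be read as shorthand for counting back from the length of the sequence. Here is a minimal sketch of that equivalence, reusing the same `greeting` string (the variable names other than `greeting` are only for illustration):

```python
greeting = "Hello there!"

# A negative index -n refers to the same element as len(greeting) - n.
last_position = len(greeting) - 1
same_last = greeting[-1] == greeting[last_position]
same_third_from_end = greeting[-3] == greeting[len(greeting) - 3]

print(same_last, same_third_from_end)
```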

In [90]:
idx = 0

while idx < len(greeting):
    print(greeting[idx])
    idx += 1

H
e
l
l
o
 
t
h
e
r
e
!


In [91]:
for character in greeting:
    print(character)

H
e
l
l
o
 
t
h
e
r
e
!


In [92]:
for n in range(5):
    print(n)

0
1
2
3
4


You can also reference subsequences of a sequence. You use square brackets like before, but you put the starting index, a colon, and the ending index. The ending index is non-inclusive: the element at the ending index isn't in the subsequence.

`[x:y]` gives us the items in the sequence from index `x` up to but not including index `y`.

In [93]:
greeting[0:5]

'Hello'

In [94]:
greeting[6:11]

'there'

In [95]:
greeting[4:7]

'o t'

In [96]:
# Negative numbers work, too!
greeting[6:-1]

'there'

In [97]:
greeting[-3:-1]

're'
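
Slices are more forgiving than single indexes. The sketch below, which assumes the same `greeting` string, shows what happens when the bounds run past the end or cross each other (the variable names are only for illustration):

```python
greeting = "Hello there!"

# Slice bounds past the end are clamped to the length of the sequence.
clamped = greeting[6:100]
print(clamped)

# A slice whose start is at or past its end is empty, not an error.
empty = greeting[8:3]
print(repr(empty))

# The length of greeting[x:y] is y - x when 0 <= x <= y <= len(greeting).
length_matches = len(greeting[4:7]) == 7 - 4
print(length_matches)
```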

You can leave off one of the numbers if you want to start at the beginning or go to the end of the sequence.

In [98]:
greeting[:5] # start at the beginning

'Hello'

In [99]:
greeting[6:] # go to the end

'there!'

In [100]:
# What happens if you leave both off?
greeting[:]

'Hello there!'

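Slices also take an optional third number, the step: `[x:y:z]` takes every `z`-th item from `x` up to but not including `y`, and a negative step walks backwards.

In [101]: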
greeting[0:12:2]

'Hlotee'

In [102]:
greeting[::-1]

'!ereht olleH'
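
To see how the step value changes the result, here is a small sketch with a few different steps on the same `greeting` string (the variable names are only for illustration):

```python
greeting = "Hello there!"

# Every second character, starting at index 0.
every_second = greeting[0:12:2]

# Start and stop can still be left off when a step is given.
every_third = greeting[::3]

# With a negative step and no start or stop, the whole sequence is reversed.
backwards = greeting[::-1]

print(every_second)
print(every_third)
print(backwards)
```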

# Lists

Strings are neat, but what if we want a sequence of other stuff, like a list of students in a class?

In [103]:
students = ['Dakota', 'Allison', 'Taylor', 'Remy', 'Parker']

In [104]:
len(students)

5

In [105]:
students[0]

'Dakota'

In [106]:
students[1]

'Allison'

In [107]:
students[1:4]

['Allison', 'Taylor', 'Remy']

What we call an array in JavaScript is a **list** in Python. You can use it like any other sequence. All sequences have [common operations](https://docs.python.org/3/library/stdtypes.html#common-sequence-operations) you can use with them.

In [108]:
# Test for inclusion.
"Taylor" in students

True

In [109]:
"Carter" in students

False

In [110]:
"Dana" not in students

True

In [111]:
# Concatenation -- adding things to a list without modifying the original list
new_students_list = students + ["Carter", "Peyton"]
print(new_students_list)
print(students)

['Dakota', 'Allison', 'Taylor', 'Remy', 'Parker', 'Carter', 'Peyton']
['Dakota', 'Allison', 'Taylor', 'Remy', 'Parker']


In [112]:
# This builds a new list, but the result isn't stored anywhere.
students + ['Alex']
print(students)
# Use += to add the new items to the original list.
students += ['Alex']

['Dakota', 'Allison', 'Taylor', 'Remy', 'Parker']


In [113]:
# Append a single item to a list -- modifies the original list
students.append("Sam")
print(students)

['Dakota', 'Allison', 'Taylor', 'Remy', 'Parker', 'Alex', 'Sam']


In [114]:
students.extend(["Avery", "Ola"])
print(students)

['Dakota', 'Allison', 'Taylor', 'Remy', 'Parker', 'Alex', 'Sam', 'Avery', 'Ola']
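
The difference between `append` and `extend` is easiest to see when the argument is itself a list. This sketch uses a short, made-up `roster` list rather than `students` so it can be run on its own:

```python
roster = ["Dakota", "Allison"]  # hypothetical short list for illustration

# extend adds each element of its argument to the list.
roster.extend(["Taylor", "Remy"])
print(roster)

# append adds its argument as ONE element, even if that argument is a list.
roster.append(["Parker", "Alex"])
print(roster)
print(len(roster))
```

After the `append`, the last element of `roster` is a list nested inside the outer list, which is rarely what you want when combining two lists.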


In [115]:
# min and max compare strings alphabetically. They might make more sense with numbers.
print(min(students))
print(max(students))

Alex
Taylor
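
Strings are compared character by character using their Unicode code points, so capitalization matters. A quick sketch with a made-up, mixed-case `names` list:

```python
names = ["dakota", "Allison", "Taylor"]  # hypothetical mixed-case list

# Every uppercase ASCII letter sorts before every lowercase one,
# so "dakota" ends up as the largest value here.
smallest = min(names)
largest = max(names)
print(smallest)
print(largest)
print("Z" < "a")
```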


In [116]:
min([9, -2, 19, -6, 4])

-6

In [117]:
sum([9, -2, 19, -6, 4])

24

In [118]:
students.index("Remy")

3

In [119]:
students.index("Emory")

ValueError: 'Emory' is not in list
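
Because `index` raises an error when the item is missing, it can help to check first. The sketch below shows two possible ways to do that; `find_position` is a hypothetical helper written for this example, not a built-in:

```python
students = ["Dakota", "Allison", "Taylor", "Remy", "Parker"]

def find_position(sequence, item):
    """Return the index of item in sequence, or None if it is absent."""
    if item in sequence:
        return sequence.index(item)
    return None

print(find_position(students, "Remy"))
print(find_position(students, "Emory"))

# The same idea with try/except, catching the ValueError directly.
try:
    position = students.index("Emory")
except ValueError:
    position = None
print(position)
```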

In [120]:
students.count("Remy")

1

`count` wasn't that useful there, but I bet it would be on a string.

In [121]:
sentence = """Tentative and then with more determination, 
    you started your own investigation."""

sentence.count("e")

8

Lists have [a lot more things](https://docs.python.org/3/library/stdtypes.html#mutable-sequence-types) they can do.

# An aside about objects

Before now, everything we saw was a function. Functions take arguments and return values. Now we have this new syntax: `sentence.count("e")`.

In Python, everything is an _object_, which means it not only is a value, but it also has defined behavior. That behavior is contained in **methods**, which are like functions, but are called on specific objects. We will see this a lot more and learn much more about it later.

If you wonder why you wouldn't do everything the same way and have `sentence.len()` instead of `len(sentence)`, or maybe `count(sentence, "e")` instead of `sentence.count("e")`, I'm with you.

In [122]:
# 😞
sentence.len()

AttributeError: 'str' object has no attribute 'len'

# For loops

One thing you will need to do in programming very often is to iterate over the members of a sequence and do something with them.

In [123]:
for student in students:
    print(student + " is a great student.")

Dakota is a great student.
Allison is a great student.
Taylor is a great student.
Remy is a great student.
Parker is a great student.
Alex is a great student.
Sam is a great student.
Avery is a great student.
Ola is a great student.


In [124]:
for number in [1, 2, 3, 4, 5]:
    print(number ** 2)

1
4
9
16
25


How can you use a for loop to do stuff besides printing? What if you wanted to make a new sequence?

In [125]:
sentence = "Making plots and visualizations is one of the most important tasks in data analysis."
all_letters = "abcdefghijklmnopqrstuvwxyz"
found_letters = []
for letter in sentence.lower():
    if letter in all_letters and letter not in found_letters:
        found_letters.append(letter)
        
print(found_letters)

['m', 'a', 'k', 'i', 'n', 'g', 'p', 'l', 'o', 't', 's', 'd', 'v', 'u', 'z', 'e', 'f', 'h', 'r', 'y']


In [126]:
print(sorted(found_letters))

['a', 'd', 'e', 'f', 'g', 'h', 'i', 'k', 'l', 'm', 'n', 'o', 'p', 'r', 's', 't', 'u', 'v', 'y', 'z']


In [127]:
numbers = [1, 2, 3, 4, 5, 6]
squares = []

for number in numbers:
    squares.append(number ** 2)
    
print(squares)

[1, 4, 9, 16, 25, 36]
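
The same start-empty-and-add pattern works when the new sequence is a string. Strings can't be changed in place, so this sketch builds a new one with `+=`. The `vowels` and `without_vowels` names are only for illustration:

```python
sentence = "Making plots and visualizations"
vowels = "aeiou"

# Start with an empty string and add each character that isn't a vowel.
without_vowels = ""
for letter in sentence:
    if letter.lower() not in vowels:
        without_vowels += letter

print(without_vowels)
```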


# Tuples

Tuples are a lot like lists, but they are immutable. This means they cannot be changed after they are created.

In [128]:
dimensions = (200, 400)
dimensions[0] = 50 # This is not ok!

TypeError: 'tuple' object does not support item assignment
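
When you need a changed version of a tuple, one option is to build a new tuple from the old one. A minimal sketch, assuming the same `dimensions` tuple (the names `resized` and `dimensions_list` are made up for this example):

```python
dimensions = (200, 400)

# Build a new tuple instead of assigning to an item of the old one.
resized = (50, dimensions[1])
print(resized)
print(dimensions)  # the original tuple is unchanged

# A list with the same values can be modified in place.
dimensions_list = list(dimensions)
dimensions_list[0] = 50
print(dimensions_list)
```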

One reason you might want to make sure that your data cannot change: when you want to have a _record_, that is, a fixed group of values where each position always means the same thing. Take coordinates, for instance, where the first value is always x and the second is always y:

In [129]:
def sqrt(number):
    return number ** 0.5

def distance(pos1, pos2):
    """Calculates the length of a straight line drawn from one coordinate to another."""
    
    adjacent = pos1[0] - pos2[0]
    opposite = pos1[1] - pos2[1]
    hypotenuse = sqrt(adjacent ** 2 + opposite ** 2)
    return hypotenuse

distance((4, 5), (1, 9))


5.0

In [130]:
x, y = (4, 5)
print(x, y)

4 5
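
Unpacking like this also works in a `for` statement and on both sides of an assignment. A short sketch, using a made-up `points` list of coordinate tuples:

```python
points = [(4, 5), (1, 9), (0, 0)]  # hypothetical coordinates

# Unpack each (x, y) tuple directly in the for statement.
x_values = []
for x, y in points:
    x_values.append(x)
print(x_values)

# Unpacking also swaps two variables without a temporary one.
width, height = 200, 400
width, height = height, width
print(width, height)
```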


Parentheses are used for multiple things in Python, so if you are using a tuple with one element, remember to put in a comma.

In [131]:
(1 + 2) * (3 + 4)

21

In [132]:
(1)

1

In [133]:
(1,)

(1,)
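
Checking the types makes the difference visible. In this sketch the variable names are only for illustration:

```python
not_a_tuple = (1)
single = (1,)
empty = ()

print(type(not_a_tuple).__name__)
print(type(single).__name__)
print(len(single), len(empty))

# The comma is what makes the tuple; the parentheses are optional here.
also_single = 1,
print(also_single == single)
```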

# Ranges

Ranges are yet another sequence type. They're great any time you need a series of numbers.

In [134]:
range(5)

range(0, 5)

In [135]:
list(range(5))

[0, 1, 2, 3, 4]

In [136]:
list(range(10, 15))

[10, 11, 12, 13, 14]

In [137]:
list(range(1, 20, 2))

[1, 3, 5, 7, 9, 11, 13, 15, 17, 19]
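
The step can be negative, and a range supports the usual sequence operations without being turned into a list first. A brief sketch (the names `countdown` and `evens` are made up for this example):

```python
# A negative step counts down; the stop value is still excluded.
countdown = list(range(5, 0, -1))
print(countdown)

# len, indexing, `in`, and slicing all work directly on a range.
evens = range(0, 20, 2)
print(len(evens))
print(evens[3])
print(14 in evens)
print(list(evens[:3]))
```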

In [138]:
# What's the sum of all odd numbers from 1 to 1000?
total = 0
for num in range(1, 1000, 2):
    total += num

total

250000

In [139]:
sum(range(1, 1000, 2))

250000
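
As a sanity check on that result, the sum of the first n odd numbers equals n squared. The sketch below confirms this for the same range (the name `odds` is only for illustration):

```python
odds = range(1, 1000, 2)

# The stop value is excluded, so the last odd number produced is 999.
print(odds[0], odds[-1], len(odds))

# There are 500 odd numbers here, and 500 squared is 250000.
n = len(odds)
print(sum(odds) == n ** 2)
```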