# Python Basics Week 2

Week 2 topics:
- String data
- Indexing & slicing strings
- f-strings
- String operations
- The `len()` function
- Membership operators & Boolean expressions

## What is a string?

Strings are sequences of Unicode characters. Strings are how we represent textual data in Python, among other things.

Strings are enclosed in double or single quotation marks. You can use either kind, so long as the opening and closing marks match.

In [22]:
"Double quote marks can contain 'single quote marks' without issues"

"Double quote marks can contain 'single quote marks' without issues"

In [23]:
'Likewise, single quotes can contain "double quotes" without issues'

'Likewise, single quotes can contain "double quotes" without issues'

In [24]:
"but we cannot put "double quotes" inside of other double quotes"

SyntaxError: invalid syntax (1653156654.py, line 1)

In [25]:
'or 'single quotes' inside of single quotes'

SyntaxError: invalid syntax (434174452.py, line 1)
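One common way around both errors is to escape the inner quote marks with a backslash, which tells Python to treat them as ordinary characters rather than as the end of the string. The sketch below is illustrative only; the variable names are made up for this example.

```python
# A backslash before a quote mark "escapes" it, so it no longer ends the string.
escaped_double = "but we can put \"double quotes\" inside of other double quotes"
escaped_single = 'or \'single quotes\' inside of single quotes'

# The backslashes are not part of the string itself; they disappear when printed.
print(escaped_double)
print(escaped_single)
```

Mixing the two kinds of quote marks, as in the first two cells, is usually easier to read than escaping.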

To make a string that spans multiple lines, use triple quotation marks.

In [20]:
long_string = """I am a very
very
very
long string
"""
print(long_string)

I am a very
very
very
long string



Strings are immutable. You cannot change a string after it has been created, but you can make *new* strings based on existing ones. We'll see how to do this later.
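As a rough preview of what immutability means in practice, the sketch below tries to change one character in place and then builds a new string instead. It borrows indexing, slicing, and concatenation from later sections, and it uses `try`/`except` (not covered this week) only so the cell keeps running after the error.

```python
word = "python"

# Trying to change a character in place raises a TypeError.
try:
    word[0] = "j"
except TypeError as error:
    print("TypeError:", error)

# Instead, build a *new* string from pieces of the old one.
new_word = "j" + word[1:]
print(new_word)  # jython
print(word)      # python (the original is unchanged)
```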

## Indexing

You can think of an index as a way to represent a character's position in a string.

Python indexes starting at zero, so the first item or character has an index of zero. (It can help to picture each index as marking the position *between* characters, as in the diagram below. This becomes especially useful when slicing.)

In [None]:
#string: | P | Y | T | H | O | N |
#index:  0   1   2   3   4   5   6

We can call a specific character in a string using its index:

In [26]:
word = "python"
word[0]

'p'

In [35]:
word[4]

'o'

In [36]:
# What happens if I call an index that doesn't have a character associated with it?
word[6]

IndexError: string index out of range
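The error above happens because a six-character string only has indices 0 through 5. The short sketch below shows one way to find the last valid index, using the `len()` function covered later in this lesson. It also shows negative indices, which count backward from the end of the string.

```python
word = "python"

# Six characters, so the last valid index is 6 - 1 = 5.
last_index = len(word) - 1
print(last_index)        # 5
print(word[last_index])  # n

# Negative indices count backward from the end of the string.
print(word[-1])  # n
print(word[-2])  # o
```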

### Slicing

If we want to call out a segment/range/sub-sequence of characters, we call that a "slice":

In [41]:
word[2:5]  # "give me every character between position 2 and position 5"

'tho'

In [47]:
#If you want to start at zero, you can specify that, but you don't have to:
print(word[0:2])  # return everything between position zero and position 2
print(word[:2])  # return everything between the first position (which is index zero) and position 2

py
py


In [48]:
# You can also slice to the end of a string:
print(word[4:])  # return everything between position 4 and the end of the string

on


In [49]:
# So what will be returned if I call this slice?
print(word[:])

python


In [93]:
# I can add a third argument to a slice to specify a step argument or skip-by value:
word[0:6:2]

'pto'

In [67]:
# Also the same as:
word[::2]

'pto'
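A few properties of slices follow from thinking of the numbers as positions between characters. The sketch below is a small illustration of them, not an exhaustive list.

```python
word = "python"

# A slice from position 2 to position 5 contains 5 - 2 = 3 characters.
middle = word[2:5]
print(middle)       # tho
print(len(middle))  # 3

# Cutting a string at any position and rejoining the halves gives back the original.
rejoined = word[:2] + word[2:]
print(rejoined)     # python

# Unlike indexing, slicing past the end does not raise an error.
tail = word[4:100]
print(tail)         # on
```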

## Formatted Strings

We can embed variable values within a string by using an f-string: put the letter `f` immediately before the opening quotation mark. Curly brackets are used around the name of any variable that you wish to include in the string. This is my (Shelby's) preferred way to write print statements.

In [96]:
name = "Shelby Watson"

# without f string:
print("Hello, ", name, "! How are you today?", sep='')  # have to change the separator (by default, whitespace) to account for the exclamation point

#with f string:
print(f"Hello, {name}! How are you today?")

Hello, Shelby Watson! How are you today?
Hello, Shelby Watson! How are you today?


In [50]:
# Using an f string with an index:
print(f'Hello, {name[0]}, how are you today?')

Hello, S, how are you today?


In [39]:
# Using an f string with a slice:
print(f"Hello {name[:6]}, how are you today?")

Hello Shelby, how are you today?
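The curly brackets are not limited to variable names, indexes, and slices. They can hold other expressions too, such as function or method calls. The sketch below shows a few; the `price` variable is a made-up value used only to illustrate number formatting.

```python
name = "Shelby Watson"
price = 3.14159  # hypothetical value, for illustration only

# A function call inside the brackets:
length_message = f"Your name has {len(name)} characters."
print(length_message)  # Your name has 13 characters.

# A method call inside the brackets:
shout_message = f"HELLO, {name.upper()}!"
print(shout_message)   # HELLO, SHELBY WATSON!

# A format specifier after a colon controls how a number is displayed:
price_message = f"Rounded to two decimal places: {price:.2f}"
print(price_message)   # Rounded to two decimal places: 3.14
```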


## String Operations & Methods

### String Operations

You can join strings together with the concatenation operator `+`:

In [1]:
name = "Shelby" + " " + "Watson"
print(name)

Shelby Watson


Similarly, the `*` operator, when used on a string, does repetition:

In [None]:
greeting = "Hello! "
greeting*5
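For reference, here is a sketch of what the repetition cell produces, along with one thing to watch for when concatenating: `+` only joins strings to other strings, so a number has to be converted with `str()` first. The `count` and `message` names are made up for this example.

```python
greeting = "Hello! "
repeated = greeting * 5
print(repeated)  # Hello! Hello! Hello! Hello! Hello!

# "text" + 5 would raise a TypeError, so convert the number to a string first.
count = 5
message = "I said hello " + str(count) + " times."
print(message)   # I said hello 5 times.
```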

### String Methods

A method is similar to a function but has slightly different syntax. Dot notation specifies the string to apply the method to (before the dot) and the method itself (after the dot). If the method takes arguments, those go in the parentheses. If the method takes no arguments, leave the parentheses empty.

In [None]:
word = "Python"

In [None]:
word.upper()

In [None]:
word.lower()

*n.b.* String methods return new values - they do not change the original string.

In [None]:
print(word)
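Putting the three cells above together, a minimal sketch of what to expect: each method hands back a new string, and `word` itself stays exactly as it was. The `upper_word` and `lower_word` names are just for this example.

```python
word = "Python"

upper_word = word.upper()
lower_word = word.lower()

print(upper_word)  # PYTHON
print(lower_word)  # python
print(word)        # Python (unchanged)
```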

In [6]:
sentence = "what a lovely day to learn to code."
sentence.capitalize()

'What a lovely day to learn to code.'

In [7]:
#vs
sentence.title()

'What A Lovely Day To Learn To Code.'

In [8]:
sentence.split()  # Splits on whitespace by default. Returns a list, which we will learn more about next week!

['what', 'a', 'lovely', 'day', 'to', 'learn', 'to', 'code.']

In [4]:
sentence.split("l")  # if you want to split on something other than whitespace, enter it as an argument

['what a ', 'ove', 'y day to ', 'earn to code.']

In [10]:
nums = "1-8-9-4-6-2-4-5-6-2-1-9"
nums.split("-")  # you can change what you want to split on by specifying it as an argument.

['1', '8', '9', '4', '6', '2', '4', '5', '6', '2', '1', '9']

In [13]:
# now put it back together
nums_list = nums.split("-")
print(nums_list)

nums_string = "-".join(nums_list)  #specify the separator first, then specify the list to join into a string
print(nums_string)

['1', '8', '9', '4', '6', '2', '4', '5', '6', '2', '1', '9']
1-8-9-4-6-2-4-5-6-2-1-9


In [15]:
phrase = "hello world!"
phrase.replace("h","j")

'jello world!'

In [51]:
# But remember, this doesn't change the actual string:
print(phrase)

hello world!


In [53]:
# If we wanted to keep it permanently, we have to assign it to its own variable:
jello_phrase = phrase.replace("h","j")
print(jello_phrase)

jello world!
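Two more details about `replace()` are worth a quick sketch: it replaces every occurrence unless you pass an optional third argument limiting the count, and because each method returns a new string, you can chain another method directly onto the result. The variable names here are made up for the example.

```python
phrase = "hello world!"

# By default, every occurrence is replaced.
all_replaced = phrase.replace("l", "L")
print(all_replaced)    # heLLo worLd!

# An optional third argument limits how many occurrences are replaced.
first_replaced = phrase.replace("l", "L", 1)
print(first_replaced)  # heLlo world!

# Methods can be chained: replace() returns a string, and title() is applied to it.
chained = phrase.replace("h", "j").title()
print(chained)         # Jello World!
```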


## The `len()` Function

`len()` (short for "length") is a built-in function that returns the number of characters in a string:

In [None]:
len("banana")

In [97]:
# Pay attention:
fruit = "strawberry"

print(len(fruit))

print(len("fruit"))

10
5
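One more thing to pay attention to: `len()` counts every character, including spaces and punctuation. A small sketch, reusing the `phrase` string from the previous section:

```python
phrase = "hello world!"

# 5 letters + 1 space + 5 letters + 1 exclamation point = 12 characters
phrase_length = len(phrase)
print(phrase_length)  # 12

# An empty string has a length of zero.
empty_length = len("")
print(empty_length)   # 0

# len() also works on the list returned by split(), counting items instead of characters.
word_count = len(phrase.split())
print(word_count)     # 2
```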


## Membership Operators & Boolean Expressions

Membership operators check whether a value is present in a sequence such as a string. The two operators are `in` and `not in`, and both return a Boolean value (`True` or `False`). You can combine these checks with the Boolean operators `and`, `or`, and `not` to build more specific conditions.

In [54]:
"i" in "team"

False

In [83]:
"m" in "team" and "e" in "team"

True
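The sketch below rounds out the examples above with `not in`, `or`, and two behaviors that are easy to miss: membership checks on strings match whole substrings, and they are case-sensitive. The variable names are only for illustration.

```python
# not in is True when the value is absent.
no_i = "i" not in "team"
print(no_i)          # True

# or is True when at least one side is True.
i_or_a = "i" in "team" or "a" in "team"
print(i_or_a)        # True

# Membership checks on strings match substrings, not just single characters.
has_tea = "tea" in "team"
print(has_tea)       # True

# Checks are case-sensitive, so convert the case first if it should not matter.
capital_t = "T" in "team"
print(capital_t)     # False
lowered_t = "T".lower() in "team"
print(lowered_t)     # True
```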

In [91]:
# Scrabble Scores
# Return "True" if the word contains at least one high-scoring (10 point) Scrabble letter
word = input("What is your word?")
"q" in word.lower() or "z" in word.lower()

What is your word? bicycle


False

You can imagine that if we had many different conditions to check, like all the Scrabble letters worth more than four points, this could get very tedious. Next week, we'll look at creating lists in Python and traversing them using loops.

## (If time) Libraries

Libraries/packages are collections of functions and/or data that help expand on Python's core functions. There are tons of Python packages available. As you continue learning (and read more code!) you will start to become familiar with some of the more common libraries. Libraries can also help you accomplish very specific tasks - for example, conducting advanced statistical analysis, reading genomic data, making plots and charts, etc. without having to write a lot of custom functions.

### Installing Libraries

If you installed Python through Anaconda, you already have most commonly used Python libraries installed on your computer. However, if you need to install a new library that you don't already have, you will use pip. You should only have to install a library on your computer once (unless it needs to be updated).

In a Jupyter notebook, starting a command with an exclamation point (`!`) passes the command to the shell. You can also run the same command (minus the `!`) in the command line/terminal/Bash shell to install a library rather than doing it through your IDE.

In [None]:
!pip install numpy

(If you installed Python through Anaconda, you will already have the numpy library on your computer.)

Doing this installs the library on your computer, but it's not actually available in your notebook yet. In order to make it available in our notebook, we need to import the library. You only need to import a package once per notebook. By convention, importing the packages required for your code should be the very first thing you see when you open a script. This lets other people using your code know what they need to install and makes all the functions you need available up front.

In [None]:
import random  # random is part of Python's standard library, so it does not need to be installed with pip

In order to save on the amount of typing that you have to do, it is convention to import some Python packages with "aliases" or shortened names. This is done with the `as` keyword. The preferred alias is usually included in the package's documentation. A few common ones are:

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
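Once a library is imported, its functions are called with the same dot notation used for string methods, with the library name (or alias) before the dot. The sketch below sticks to two standard-library modules so that it runs without installing anything. The alias `m` for `math` is an assumption made purely to illustrate the `as` keyword and is not a common convention.

```python
import random
import math as m  # "m" is an illustrative alias only, not a standard one

# Fixing the seed makes the "random" result repeatable from run to run.
random.seed(42)
roll = random.randint(1, 6)  # a whole number from 1 to 6, inclusive
print(roll)

# With an alias, use the short name before the dot.
root = m.sqrt(16)
print(root)  # 4.0
```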