# Week 2 Lab Task
This week is about getting started with powerful tools that will underlie many of the skills you learn in the course. Much of the effort is in setting up your programming environment: the lab questions will ensure that it is done correctly and help you grow familiar with it.

In this course we'll use the Python programming language inside an innovative environment called Jupyter Notebooks.

Your _environment_ is similar to your local workspace. Look at your desk: how do you organize your pens, paper, mouse, monitor? Or maybe you have a barebones workspace, working at a coffee shop or kitchen table with only a cup of coffee. In the same way, there are many different environments for working with Python, such as typing commands on a command line or running scripts. Jupyter Notebooks is an environment that gives you an interactive, browser-based version of Python. It lets you play with code and get immediate feedback, so you can break, tinker, and retry.

Jupyter Notebooks will be installed through Anaconda.

When programming, you're usually not writing everything from scratch. Some code is needed by many other people, so most languages have a concept of a _library_: code written and distributed by other people that you can easily use in your own work. 

Anaconda is a scientific distribution of Python, which installs Python on your system alongside a great many libraries that scientists use. To be clear: it is possible to install Python in other ways and install the libraries individually, but Anaconda puts it all into a tidy package. Because scientific libraries often depend on complicated mathematical algorithms, installing some of them can be very difficult: Anaconda makes it easy!

## 1. Installing Jupyter Notebooks through Anaconda

Install Jupyter Notebooks by following two notebooks from the Art of Literary Text Analysis (ALTA): [Getting Setup](https://github.com/sgsinclair/alta/blob/master/ipynb/GettingSetup.ipynb) and [Getting Started](https://github.com/sgsinclair/alta/blob/master/ipynb/GettingStarted.ipynb) (you can stop before the Printing Dynamic Content section). Make sure you install the Python 3 version. Because this is our first introduction to ALTA, it's worth reading the [short introductory text](https://github.com/sgsinclair/alta/blob/master/ipynb/ArtOfLiteraryTextAnalysis.ipynb). If you have trouble with installation, start a discussion in the Open Discussion forum.

After you've finished the installation, start a new notebook and follow along with the tour at Help > User Interface Tour.
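Once a notebook is open, one quick way to confirm that it is running Python 3 is to ask Python for its own version. This is a minimal sketch using the standard `sys` module; the name `major_version` is only for illustration, and the exact version you see depends on your installation.

```python
import sys

# sys.version_info describes the Python interpreter running this cell
major_version = sys.version_info.major
print("Running Python", major_version)
```

If the number printed is 3, the right version is installed.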

_Questions_

- 1) What are the two modes of a notebook?
- 2) What do you press to leave edit mode while in a cell?
- 3) What are the Keyboard Shortcuts for:
 - a) insert cell below
 - b) insert cell above
 - c) run selected cells

## 2. A Little Bit of Code

Create a new cell in your notebook with the '+' button in the toolbar (or one of the keyboard shortcuts from the previous question). We're going to try two simple Python commands: setting a variable, and splitting it by whitespace. In the process, we'll encounter two types of data that Python can hold: a string, and a list.

Add the following code to the cell and 'run' it. If it runs properly, it should look like the example below, with the 'In' and 'Out' information.

In [None]:
sentence = "Hello world."
sentence

'Hello world.'

Here, we assigned a string to a variable, then asked for the value of that variable by writing its name on the last line.

_Questions_
- 4) What output is there if you run the cell without the second line (which simply says `sentence`)?

A string is a type of data in Python. Once it is assigned to the variable `sentence`, using `sentence` anywhere is exactly the same as writing `"Hello world."` Consider the following examples (or try them out yourself), which show that joining two strings works the same way with a variable as with a string written out directly:

In [None]:
"Hello world." + " Hello moon."

'Hello world. Hello moon.'

In [None]:
sentence + " Hello moon."

'Hello world. Hello moon.'

In [None]:
sentence + sentence

'Hello world.Hello world.'

We can even see the datatype of a variable with `type()`:

In [None]:
type(sentence)

str
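`type()` accepts any value, not only a variable. As a small sketch, the cell below checks a variable, a string written out directly, a whole number, and the result of joining two strings. The names starting with `type_of_` are made up for this example.

```python
sentence = "Hello world."

# type() can be given a variable or a value written out directly
type_of_variable = type(sentence)
type_of_literal = type("Hello moon.")
type_of_number = type(42)

# Joining two strings with + produces another string
type_of_joined = type(sentence + " Hello moon.")

print(type_of_variable, type_of_literal, type_of_number, type_of_joined)
```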

If you have a really long string that needs to go across lines, you can use `\` before the line break to tell Python that _this line of code is not done yet_. Set this famously long sentence from _Paul Clifford_ to the variable `paragraph` in your notebook:

In [None]:
paragraph = "It was a dark and stormy night; the rain fell in torrents — except at occasional intervals, when it was " + \
    "checked by a violent gust of wind which swept up the streets (for it is in London that our scene lies), rattling " + \
    "along the housetops, and fiercely agitating the scanty flame of the lamps that struggled against the darkness."
paragraph

'It was a dark and stormy night; the rain fell in torrents — except at occasional intervals, when it was checked by a violent gust of wind which swept up the streets (for it is in London that our scene lies), rattling along the housetops, and fiercely agitating the scanty flame of the lamps that struggled against the darkness.'

_Questions_ 
- 5) For the code block above, 
  - a) Are the indents necessary for the code to run?
  - b) Are the pluses (+) necessary for the code to run?
  - c) Are the backslashes (\\) necessary for the code to run?
  
_tinker with the code and re-run as necessary_

Another important datatype in Python is the `list`. This is a way to hold multiple things together: strings, numbers, etc. For example:

In [None]:
list_of_strings = ["Never", "gonna", "give", "you", "up"]
list_of_strings

['Never', 'gonna', 'give', 'you', 'up']

In [None]:
list_of_numbers = [ 4, 8, 15, 16, 23, 42]
list_of_numbers

[4, 8, 15, 16, 23, 42]

Individual objects from a list can be retrieved using square brackets containing the object's position in the list (counting from 0):

In [None]:
list_of_strings[0]

'Never'

In [None]:
list_of_numbers[1]

8
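Because counting starts at 0, the last item of a list sits at position `len(your_list) - 1`. A short sketch, re-creating the list so the cell runs on its own; `count` and `last_item` are names chosen for illustration:

```python
list_of_numbers = [4, 8, 15, 16, 23, 42]

# len() gives the number of items in the list
count = len(list_of_numbers)

# The last valid position is one less than the length
last_item = list_of_numbers[count - 1]
print(count, last_item)   # 6 42
```

Asking for `list_of_numbers[6]` would raise an `IndexError`, since positions only run from 0 to 5.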

You can select a range from a list by specifying two numbers in the square brackets with a colon in between:

In [None]:
list_of_strings[1:4]

['gonna', 'give', 'you']

Using the colon without a number means _from the very start_ or _until the very end_:

In [None]:
list_of_strings[:4]

['Never', 'gonna', 'give', 'you']

In [None]:
list_of_strings[1:]

['gonna', 'give', 'you', 'up']
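One detail worth checking for yourself: the number after the colon is where the slice stops, and the item at that position is not included. Slicing also leaves the original list untouched. A sketch, re-creating the list so the cell runs on its own (`middle` is a name made up for this example):

```python
list_of_strings = ["Never", "gonna", "give", "you", "up"]

# Start at position 1 and stop before position 3
middle = list_of_strings[1:3]
print(middle)            # ['gonna', 'give']

# The slice is a new list; the original still has all five items
print(list_of_strings)   # ['Never', 'gonna', 'give', 'you', 'up']
```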

You can add to a list with `list.append()`:

In [None]:
list_of_strings.append("Word")
list_of_strings

['Never', 'gonna', 'give', 'you', 'up', 'Word']
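Note that `append()` changes the existing list rather than producing a new one, so running an append cell more than once keeps adding items. A minimal sketch with a fresh list (the name `greeting_words` is made up for this example):

```python
greeting_words = ["Hello", "world"]

# append() adds one item to the end of the existing list
greeting_words.append("Word")
print(greeting_words)   # ['Hello', 'world', 'Word']

# Appending again adds another copy, so the list keeps growing
greeting_words.append("Word")
print(greeting_words)   # ['Hello', 'world', 'Word', 'Word']
```

If your output ever has more items than you expected, an append cell that was run twice is a likely cause.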

_Questions_

- 6) Can a list have a mix of numbers and strings?
- 7) We joined strings with '+'. What happens if you try to use '+' on two lists?

## 3. Splitting a string into a list

A string can be split into a list using a splitting character. In the (useless) example below, we tell Python to treat every 'o' as a place to split the string into a list:

In [None]:
sentence.split("o")

['Hell', ' w', 'rld.']

This can be used for a simple word tokenization by space characters:

In [None]:
words = sentence.split(" ")
words

['Hello', 'world.']
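Splitting on `" "` is literal: wherever two spaces sit side by side, the result contains an empty string. Calling `split()` with no argument splits on any run of whitespace instead. A short sketch; `spaced_sentence` is a made-up string with two spaces after "Hello" and three after "big":

```python
spaced_sentence = "Hello  big   world."

# Splitting on exactly one space leaves empty strings where spaces repeat
by_single_space = spaced_sentence.split(" ")
print(by_single_space)   # ['Hello', '', 'big', '', '', 'world.']

# With no argument, split() treats any run of whitespace as one separator
by_whitespace = spaced_sentence.split()
print(by_whitespace)     # ['Hello', 'big', 'world.']
```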

_Questions_

 - 8) How would you select a list with the first seven words in the `paragraph` variable? This will require two steps. Show your code and the output.
 - 9) The opposite of `split` is possible with `"string_to_join_list_items_by".join(your_list)`. Set the list from question 8 to a variable and join it into a single string. The output will be 'It was a dark and stormy night;': write your code.
 - 10) Split the following text into a list of *sentences*. Don't worry if one of your sentences is an empty string (''). Show the code and output.
     > The show opens at Duckburg. After Donald Duck enlists in the navy, Uncle Scrooge has to take care of grand-nephews Huey, Dewey, and Louie. Uncle Scrooge brings the boys to the McDuck's mansion where they are presented to Duckworth, the butler. The nephews are forced to sleep in the attic.