# Basic Python for Linguists in 10 Minutes

[Jackson Lee](https://jacksonllee.com/)

April 2021

## Introduction

Perhaps you're attending a class or tutorial that says "prior experience in Python programming is helpful but not required" or similar. You may not have a background in computer programming or in Python, but you'd still like to attend and get something out of it. If you have 10 minutes to spare beforehand, the following topics are the bare minimum that you'd want to understand, at least by reading through the code snippets in this Jupyter notebook (that's the name of the thing you're reading right now; "Jupyter" is pronounced like the planet in our solar system):

* *Strings*, probably the most important type of Python object you need to know when handling language data
* *Lists*, to deal with a bunch of things
* *Loops*, to do something repeatedly
* *Conditionals*, to do X instead of Y by some condition

If you're a linguist, picking up a language should be a piece of cake. 🙂

## Strings

To work with language data in Python, understand how **strings** work.

In [1]:
x = "this is a string"

What's just happened? We've assigned the string `"this is a string"` to the variable `x`.

In [2]:
type(x)

str

How many characters are there in the string? The **`len()`** (for "length") function is defined for strings (and many other data types for which the idea of "number of things in it" makes sense).

In [3]:
len(x)

16

Access the first character. Python counts from zero, like many programming languages.

In [4]:
x[0]

't'

You can **slice** a string. Access the characters from index 5 to (but excluding) index 10.
You get back a string whose length is 5, for indices {5, 6, 7, 8, 9}.
(Yes, the space is a character and has a length of one.)

In [5]:
x[5:10]

'is a '
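As a small sketch of the same idea, either index of a slice can be left out. The variable names `first_word`, `last_word`, and `middle` are made up for this illustration and aren't used elsewhere in this notebook.

```python
x = "this is a string"

# Leaving out the start index means "from the beginning".
first_word = x[:4]

# Leaving out the end index means "through the end".
last_word = x[10:]

# The length of a slice is the end index minus the start index.
middle = x[5:10]

print(first_word)   # this
print(last_word)    # string
print(len(middle))  # 5
```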

## Lists

We can handle a single string like `x` above. What about multiple strings? Sure, we can define more variables:

In [6]:
y = "this is another string"
z = "this is yet another string"

Okay, you see where this is going. What if we want to handle many more? Defining variables one by one isn't sustainable. We need a way to deal with _a bunch of things_. That's where containers in Python come in. We're going to see how **lists** in Python work below. (There are many other options -- hey, you've got only ten minutes!)

Note how square brackets `[ ]` are used to define a list.

In [7]:
words = ["cats", "dogs"]

In [8]:
type(words)

list

Just like strings, you can ask for the length of a list.

In [9]:
len(words)

2

The ordering of the elements in a list is important.

In [10]:
["cats", "dogs"] == ["dogs", "cats"]

False

Elements in a list can repeat.

In [11]:
more_words = ["cats", "dogs", "dogs"]

In [12]:
more_words

['cats', 'dogs', 'dogs']

In [13]:
len(more_words)

3

Remember the string `x` from above? We can break it up by the spaces to create a list of words.

In [14]:
x

'this is a string'

`.split()` applies to a string. Called with no arguments, it splits the string on whitespace.

In [15]:
result = x.split()

In [16]:
result

['this', 'is', 'a', 'string']

In [17]:
type(result)

list

In [18]:
len(result)

4
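Here's a minimal sketch of two more things `.split()` can do. The strings `messy` and `glossed` are invented examples, not data from this notebook.

```python
# With no argument, .split() treats any run of whitespace as one separator,
# so extra spaces don't produce empty strings.
messy = "this   is  a    string"
print(messy.split())  # ['this', 'is', 'a', 'string']

# With an argument, .split() splits on exactly that separator instead.
glossed = "cat-PL"
parts = glossed.split("-")
print(parts)  # ['cat', 'PL']
```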

See how things are getting more interesting now? We've just begun combining knowledge of strings and lists.

A list supports the same indexing and slicing syntax, `[0]`, `[5:10]`, etc., that we've seen above for strings.

In [19]:
result[0]

'this'

In [20]:
result[-1]  # [-1] gives the final element.

'string'

Slicing a list gives you a list, just as slicing a string gives you a string.

In [21]:
result[:2]  # If the starting index isn't specified, it's 0, i.e., from the beginning.

['this', 'is']

## `for` loops

You've probably heard that computers are very good at repeating a task over and over. Let's see how Python **loops** work. We'll have time for **`for`** loops only.

In [22]:
result

['this', 'is', 'a', 'string']

In [23]:
for word in result:
    print(word)
    print(word[0])

this
t
is
i
a
a
string
s


Okay, let's unpack what you've just seen.

* `result` is a list of four strings.
* We iterate over each of them and give it a variable name `word` -- that's the `for word in result:` part.
* With every `word`, we do something: just print it, and then just print the first character of it.

This is a toy example, but the power comes in when you have a lot of elements to iterate over, and when you can control what the computer should do in specific iterations given some condition.
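To make the loop do slightly more than printing, here's a sketch that adds up the lengths of the words. The variable `total` is just an illustrative name; it starts at zero and grows on every pass through the loop.

```python
result = "this is a string".split()

total = 0
for word in result:
    # len() works on each word, just as it did on the whole string earlier.
    print(word, len(word))
    total = total + len(word)

print(total)  # 13
```

The four words have 4, 2, 1, and 6 characters, so the total is 13 (spaces aren't counted, because `.split()` removed them).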

## Conditionals

No, we aren't talking about counterfactuals, etc. Conditionals in programming languages are just about the plain, indicative kind.

In [24]:
if 3 > 2:
    print("hi")

hi


In [25]:
if 3 < 2:
    print("hi")

If condition C is true, do X. If condition C is not true, don't do X. Got it? (For the semanticists / pragmaticists / logicians out there, I know what you have in mind...)

In [26]:
if 3 < 2:
    print("hi")
else:
    print("bye")

bye


This is an if-else code block. If some condition is true, do X, or else do Y instead.
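The condition doesn't have to be about numbers. As a toy sketch, the condition below asks a question about a string. The "ends in s" test and the label text are made up for illustration and are nowhere near real morphological analysis.

```python
word = "dogs"

# .endswith() gives back True or False, which is all that `if` needs.
if word.endswith("s"):
    label = "maybe plural"
else:
    label = "maybe singular"

print(label)  # maybe plural
```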

This is the point in a language class when you're handed an example that combines some of what you've been explicitly introduced to (possibly with new stuff). You're left squinting and trying to make sense of it, and when you finally do, you feel you've accomplished a lot today:

In [27]:
text = "Among the languages that are spoken today, only few are even tolerably well known to science. Of many we have inadequate information, of others none at all."
# The very first sentence of the chapter "The Languages of the World" in Bloomfield's (1933) _Language_

start_with_a = []
not_start_with_a = []

for word in text.strip().lower().split():
    if word.startswith("a"):
        start_with_a.append(word)
    else:
        not_start_with_a.append(word)
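If the chained call `text.strip().lower().split()` is hard to read, one way to see it is as three steps applied left to right. The sketch below uses a short made-up string (`sample`) and hypothetical intermediate names so each step is visible.

```python
sample = "  Among the Languages  "

stripped = sample.strip()   # remove whitespace at both ends
lowered = stripped.lower()  # turn every letter into lowercase
words = lowered.split()     # break the string into a list of words

print(stripped)  # Among the Languages
print(lowered)   # among the languages
print(words)     # ['among', 'the', 'languages']

# Chaining the three calls gives the same list in one line.
print(sample.strip().lower().split() == words)  # True
```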

In [28]:
start_with_a

['among', 'are', 'are', 'at', 'all.']

In [29]:
not_start_with_a

['the',
 'languages',
 'that',
 'spoken',
 'today,',
 'only',
 'few',
 'even',
 'tolerably',
 'well',
 'known',
 'to',
 'science.',
 'of',
 'many',
 'we',
 'have',
 'inadequate',
 'information,',
 'of',
 'others',
 'none']
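Notice that punctuation sticks to the words (`'all.'`, `'today,'`), because `.split()` only looks at whitespace. One possible cleanup, which the example above does not do, is to strip punctuation from both ends of each word. This is a rough sketch that assumes the standard library's `string.punctuation` is a good enough list of characters to remove; the names `cleaned` and `cleaned_start_with_a` are made up here.

```python
import string

text = "Among the languages that are spoken today, only few are even tolerably well known to science. Of many we have inadequate information, of others none at all."

cleaned_start_with_a = []

for word in text.strip().lower().split():
    # Remove any punctuation characters at the start or end of the word.
    cleaned = word.strip(string.punctuation)
    if cleaned.startswith("a"):
        cleaned_start_with_a.append(cleaned)

print(cleaned_start_with_a)  # ['among', 'are', 'are', 'at', 'all']
```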

I hope you've found this 10-minute tutorial helpful!