# An Introduction to Computer Science

## What is Computer Science?

Computer science is the study of computing machines. Computer scientists worry about three things:
1. What problems can be solved using computation?
2. How to solve those problems?
3. What techniques lead to effective solutions?

It turns out that many different problems can be addressed with computation.

Most people who call themselves computer scientists specialize in a particular area.
1. Systems: The study of large systems
    * e.g. the operating systems on our computers, the Facebook social network
2. Artificial Intelligence: How to get computers to do things that living things are good at
    * e.g. recognizing faces and photographs
3. Graphics: Making beautiful movies and video games
4. Security: Making sure we can run our businesses on the internet without the NSA snooping around
5. Networking
6. Programming Languages
7. Theory
8. Scientific Computing
9. Etc.: all the other fields that appear when we try to use computing to solve a problem in the real world

Within each field, there are subfields.

In Artificial Intelligence, there are subfields:
* Decision Making: The best chess and checkers players in the world are computers
* Robotics: building self-driving cars, robots that can do laundry for us
* Natural Language Processing: How to get computers to work with natural languages such as English or Mandarin

And within subfields, there are subsubfields. In Natural Language Processing, there are:
* Translation: Automatically translate from one language to another
* Answering Questions

Computer scientists do many different things because there are so many different problems in the world that we want to solve.

**All computer scientists have one common enemy: `complexity`**

## What is This Course About?
This course is about **managing complexity**, and the most important weapon for battling `complexity` is **`abstraction`**.

#### Mastering Abstraction

`Abstraction` is something we already do all the time. It is when we take a complex system, treat it as one whole thing, give it a name, and stop worrying about the details inside.

We say "John is our instructor in this course". In actuality, John is clothing covering skin, which covers muscles, bones, cells, and so on, which is unbelievably complex. But we just call him "John" and talk about the things that he teaches without worrying about how many carbon atoms he has in his body.

We want to learn to use the same idea of `abstraction` when we create computer programs.

#### Programming Paradigms

A programming paradigm is a broad idea of how to organize programs to take advantage of our `abstraction`s.

#### Not all about 0 and 1

Computer science is not all about the low-level details of how a computer manipulates 0s and 1s. That is a common misconception. What we actually do is express ideas in programming languages.


### This course uses Python
1. Full understanding of the language fundamentals
2. Learning through implementation
3. How computers interpret programming languages

# Expressions

## Types of Expressions

**An expression describes a computation and evaluates to a value.**

Expressions are not particular to computer science. Mathematicians have been describing how to combine numbers for a long time, and they invented many different notations for describing computation with expressions:
<img src='expression.jpg' width = 600/>

It turns out that one of them is a generalization of all the others, and it's all we need: **function call notation**, `f(x)`.
<img src = 'function.jpg' width = 600/>
Instead of using the symbols, subscripts, superscripts, vertical bars, etc., computer scientists write down everything using `function call notation`.
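
As a small sketch of this idea, the cell below writes several familiar mathematical notations as call expressions. The specific numbers are illustrative choices rather than values taken from the figures above, and `add` and `mul` come from Python's standard `operator` module.

```python
from math import sqrt
from operator import add, mul

# Each familiar notation (in the comment) is written as a call expression.
sum_value = add(18, 69)        # 18 + 69
product_value = mul(6, 23)     # 6 * 23
root_value = sqrt(1764)        # square root of 1764
power_value = pow(2, 10)       # 2 raised to the 10th power
absolute_value = abs(-1869)    # |-1869|

print(sum_value, product_value, root_value, power_value, absolute_value)
# 87 138 42.0 1024 1869
```

Every line has the same shape: a function name, an opening parenthesis, comma-separated inputs, and a closing parenthesis.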

## Call Expressions in Python

**All expressions can use function call notation**

## Demo

Here we type an expression:

In [1]:
2015

2015

And it shows the value. If we write an expression that combines two values by addition, such as the one below,

In [2]:
2000 + 15

2015

The resulting value is what we get by combining the values `2000` and `15`.

Python is also a powerful calculator. We can multiply numbers,

In [4]:
2 * 3

6

Or nest expressions,

In [5]:
2 * ((3 + 4) * 5)

70

Or we can do the following,

In [13]:
1 * 2 * ((3 * 4 * 5 // 6) ** 3) + 7 + 8

2015

`1 * 2 * ((3 * 4 * 5 // 6) ** 3) + 7 + 8` is an expression, while `2015` is its value.
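
To see where `2015` comes from, the sketch below breaks the same expression into named steps. The variable names are illustrative only. It relies on `//` being floor division and `**` being exponentiation.

```python
# Break the long expression into its parts, innermost first.
inner = 3 * 4 * 5 // 6      # 60 // 6 is 10 (floor division)
cubed = inner ** 3          # 10 ** 3 is 1000 (exponentiation)
scaled = 1 * 2 * cubed      # 2000
total = scaled + 7 + 8      # 2015

print(inner, cubed, scaled, total)
# 10 1000 2000 2015
```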

What about call expressions? Call expressions have a special form in which we name the function that we want to call. In this case, the `max` function is applied to the values of the expressions `2` and `4`.

In [14]:
max(2, 4)

4

And we can do the same thing for `min`,

In [15]:
min(-2, 50000)

-2

The whole `max(2, 4)` is a call expression.

Earlier, we claimed that everything can be expressed using a call expression. This includes multiplication, division, addition, etc.

In addition to the symbol `+`, there's a function `add`. This function is not available by default; we have to write the following import to obtain access to `add` and `mul`,

In [2]:
from operator import add, mul

Now we can add together `2` and `3` using a call expression.

In [17]:
add(2, 3)

5

And we can multiply together `2` and `3` to get `6`.

In [18]:
mul(2, 3)

6
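
The same standard `operator` module provides functions for the other arithmetic symbols as well. The sketch below shows a few of them; the numbers are arbitrary examples.

```python
from operator import sub, truediv, floordiv, mod

difference = sub(2015, 15)         # same as 2015 - 15
quotient = truediv(7, 2)           # same as 7 / 2
floor_quotient = floordiv(7, 2)    # same as 7 // 2
remainder = mod(7, 2)              # same as 7 % 2

print(difference, quotient, floor_quotient, remainder)
# 2000 3.5 3 1
```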

Call expressions can combine multiple values by separating the operands with commas,

In [19]:
max(1, 2, 3, 4, 5)

5

Call expressions can also be nested within one another. In this case, we add `2` to the product of `4` and `6`, then multiply that sum by the result of adding `3` and `5`.

In [20]:
mul(add(2, mul(4, 6)), add(3, 5))

208

In call expressions, we don't need to memorize the order of operations. The nesting structure of the expression itself tells us exactly what gets multiplied before it gets added. For example, Python has to multiply `4` and `6` first, then add the result to `2`.

With infix operators such as `*` and `+`, by contrast, we do have to remember that multiplication and division come before addition and subtraction.
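
A short sketch of the contrast, using arbitrary numbers: with infix operators the grouping depends on precedence rules or explicit parentheses, while in call expressions the nesting states the grouping directly.

```python
from operator import add, mul

# Infix: multiplication happens first unless parentheses say otherwise.
infix_default = 2 + 4 * 6
infix_grouped = (2 + 4) * 6

# Call expressions: the nesting shows which operation happens first.
call_default = add(2, mul(4, 6))
call_grouped = mul(add(2, 4), 6)

print(infix_default, call_default)   # 26 26
print(infix_grouped, call_grouped)   # 36 36
```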

# Anatomy of a Call Expression
Now we're going to talk about the anatomy of a call expression. Here is an example: `add(2, 3)`

In [21]:
add(2, 3)

5

The whole `add(2, 3)` is a call expression. This call expression contains other expressions.

1. Everything that comes before the opening parenthesis `(` is the `operator` subexpression. A subexpression is an expression within an expression.
<img src = 'operator.jpg' width = 500/>

2. The `operands` are separated by commas `,` within parentheses `( )`.
<img src = 'operand.jpg' width = 500/>

**Operators and operands are also expressions**, so they all evaluate to values.

The way call expressions are evaluated is called the **evaluation procedure**.

Programming languages interpret expressions by applying evaluation procedures. They apply the same procedures over and over again.

The **evaluation procedure** for call expressions works as follows,
1. Evaluate the operator and then the operand subexpressions
    * We have the `function` `add` from the `operator` and `argument`s `2` and `3` from the `operand`s
<img src = 'evaluation.jpg' width = 500/>
2. **Apply**:
    * The `function` that is the value of the `operator subexpression` to
    * The `arguments` that are the values of the operand subexpressions.
    
All of this is just to add `2` and `3` together. We go through the steps in such detail to show that the same exact evaluation procedure can be used many times to evaluate more complicated, nested expressions.
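
The two steps can be sketched as a tiny evaluator. This is only an illustration of the procedure, not how Python itself is implemented: here a call expression is represented as a tuple such as `('add', 2, 3)`, and the `FUNCTIONS` table and `evaluate` function are hypothetical names invented for this example.

```python
from operator import add, mul

# Hypothetical table mapping operator names to the functions they denote.
FUNCTIONS = {'add': add, 'mul': mul, 'max': max, 'min': min}

def evaluate(expression):
    """Evaluate a number or a nested tuple such as ('add', 2, 3)."""
    if isinstance(expression, (int, float)):
        return expression  # a number evaluates to itself
    operator_name, *operands = expression
    # Step 1: evaluate the operator, then the operand subexpressions.
    function = FUNCTIONS[operator_name]
    arguments = [evaluate(operand) for operand in operands]
    # Step 2: apply the function to the arguments.
    return function(*arguments)

simple = evaluate(('add', 2, 3))
nested = evaluate(('mul', ('add', 2, ('mul', 4, 6)), ('add', 3, 5)))
print(simple, nested)
# 5 208
```

Because `evaluate` calls itself on each operand, the same two steps handle expressions nested to any depth.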

## Evaluating Nested Expressions

In [4]:
mul(add(4, mul(4, 6)), add(3, 5))

224

When we run the cell above, Python just gives the result `224`. How did Python do that? Python applied the same evaluation procedure as before.

Python recognizes that the whole code `mul(add(4, mul(4, 6)), add(3, 5))` is a call expression. 

First, Python evaluates the `operator`, which is the function `mul` that multiplies,
<img src = 'mul.jpg' width = 550/>
Then Python evaluates the first `operand`, `add(4, mul(4, 6))`, which is another call expression! Thus, Python applies the same evaluation procedure to this operand,
<img src = 'operand_2.jpg' width = 300/>


## `add(4, mul(4, 6))`
In this call expression, Python evaluates the `operator`, the `add` function, and the operands:
1. The number `4`
2. The call expression `mul(4, 6)`
<img src = 'add_4_mul_4_6.jpg' width = 400/>

We have a procedure for evaluating call expressions: evaluate the operator and the operand subexpressions. From there, we get:
1. The function `mul` that multiplies 
2. The numbers `4` and `6`. 

Since we have the operands evaluated, we can apply `mul` to `4` and `6` to obtain `24`.
<img src = '24.jpg' width = 500/>
`24` is the value of the call expression `mul(4, 6)`.

Now that we have evaluated all the operands from the subexpression `add(4, mul(4, 6))`, 
<img src = 'add_4_24.jpg' width = 500/>
we can apply `add` to `4` and `24`,
<img src = '28.jpg' width = 500/>

## Back to the main call expression
<img src = 'back_to_main.jpg' width = 500/>

With the first operand evaluated, we still need to evaluate the second operand, `add(3, 5)`. Applying the same evaluation procedure as before gives `8`, and applying `mul` to `28` and `8` gives `224`, the value of the whole call expression.

<img src = '224.jpg' width = 500/>

This whole diagram is called an **expression tree**. It is an illustration of everything that happens within the computer as it evaluates this nested call expression.
<img src = 'expression_tree.jpg' width = 600/>

The important thing in this call expression evaluation is the order in which expressions are evaluated. We have to know that the expression `add(4, mul(4, 6))` evaluates to `28` and the expression `add(3, 5)` evaluates to `8` before we can finish the computation.
<img src = 'whole.jpg' width = 700/>

Above, `28` is both the **value of the subexpression `add(4, mul(4, 6))`** and **the first argument to `mul`**.
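
One way to observe this order is to wrap `add` and `mul` so that every application is recorded. The sketch below does that; `traced`, `wrapper`, and `applications` are hypothetical helper names, and function definitions like these are covered later in the course.

```python
from operator import add, mul

applications = []  # records each application in the order it happens

def traced(name, function):
    """Return a version of `function` that logs every application."""
    def wrapper(*arguments):
        value = function(*arguments)
        applications.append((name, arguments, value))
        return value
    return wrapper

traced_add = traced('add', add)
traced_mul = traced('mul', mul)

result = traced_mul(traced_add(4, traced_mul(4, 6)), traced_add(3, 5))

for name, arguments, value in applications:
    print(name, arguments, '->', value)
# mul (4, 6) -> 24
# add (4, 24) -> 28
# add (3, 5) -> 8
# mul (28, 8) -> 224
```

The innermost `mul` is applied first, and the outermost `mul` is applied last, once both of its arguments `28` and `8` are known.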

# Functions, Objects, and Interpreters
This is a demonstration of some Python features. 

Here we will open Shakespeare's plays,

In [5]:
shakes = open('shakespeare.txt')

Here we tell Python to read all the plays that he wrote, and split them into individual words,

In [6]:
text = shakes.read().split()

Now `text` contains Shakespeare's words. The first 25 words are the following,

In [7]:
text[:25]

['A',
 "MIDSUMMER-NIGHT'S",
 'DREAM',
 'Now',
 ',',
 'fair',
 'Hippolyta',
 ',',
 'our',
 'nuptial',
 'hour',
 'Draws',
 'on',
 'apace',
 ':',
 'four',
 'happy',
 'days',
 'bring',
 'in',
 'Another',
 'moon',
 ';',
 'but',
 'O']

"A Midsummer-night's Dream" is the title of the text. Below we can evaluate how many words there are in the text:

In [8]:
len(text)

980637
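
The read, split, and count steps can be tried without the original file. In the sketch below, `io.StringIO` is used as a hypothetical in-memory stand-in for `shakespeare.txt`, holding just the opening words shown above.

```python
import io

# Hypothetical stand-in for the file: a short in-memory text.
sample = io.StringIO("Now , fair Hippolyta , our nuptial hour\nDraws on apace")

# read() returns the whole text; split() breaks it on any whitespace.
tokens = sample.read().split()

print(tokens[:5])   # ['Now', ',', 'fair', 'Hippolyta', ',']
print(len(tokens))  # 11
```

Note that `split()` only separates on whitespace, so punctuation shows up as its own item only when the text already has spaces around it, as it does here.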

It turns out the word `the` appears a lot in the text! We can count how many times the word "the" appears,

In [9]:
text.count('the')

23272

One typical word of the Shakespearean era is "thou". We can count how many times "thou" appears in the text,

In [10]:
text.count('thou')

4501

And we can try to count other words too,    

In [11]:
text.count('you')

12361

In [13]:
text.count('forsooth')

40

What is the most common item in the text? Judging from the first 25 words, the comma `,` seems to appear a lot. We can try counting the number of commas in the text.

In [15]:
text.count(',')

81827

81,827! We can also calculate the proportion of commas in the text by dividing the count by the total length of the text,

In [16]:
text.count(',') / len(text)

0.0834427010198473

As we can see, the comma `,` makes up about 8% of the text.

We can check whether a word is in the text. First, we create a `set`,
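
Rather than guessing which item to count, the standard library's `collections.Counter` can tally every item at once. The sketch below uses a short hypothetical sample in place of the full text, so its numbers say nothing about Shakespeare.

```python
from collections import Counter

# Hypothetical sample standing in for the full list of words.
tokens = "Now , fair Hippolyta , our nuptial hour Draws on apace".split()

counts = Counter(tokens)                       # tally every distinct item
most_common_token, occurrences = counts.most_common(1)[0]
proportion = occurrences / len(tokens)         # share of the whole sample

print(most_common_token, occurrences)  # , 2
print(round(proportion, 3))            # 0.182
```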

In [18]:
words = set(text)

A `set` is an unordered collection of unique elements. We can see the documentation below,

In [20]:
set?

Now we can check whether certain words are in `words`.

In [21]:
'forsooth' in words

True

In [22]:
'the' in words

True

We can see how many unique words are in the text by evaluating the length of `words`.

In [23]:
len(words)

33505
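
A small sketch of the same operations on a hypothetical list shows how a `set` drops duplicates and answers membership questions.

```python
# Hypothetical sample list with repeated items.
tokens = ['the', 'moon', ',', 'the', 'sun', ',', 'the', 'stars']

unique_tokens = set(tokens)   # duplicates collapse to a single element

print(len(tokens))                    # 8
print(len(unique_tokens))             # 5
print('moon' in unique_tokens)        # True
print('forsooth' in unique_tokens)    # False
```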

By now, we have covered 2 major themes of the course:

1. Functions, such as `len()`, `open()`
2. Objects
    * A `set` is an object. `words` represents and behaves like the set of all words in Shakespeare.

And last but not least, there is the programming language itself, which is how we express all the information in the cells above and which the computer interprets to give us the result that we see from each cell.

Everything that we have run so far in each cell is an expression. We have simple expressions like the following,

In [24]:
'draw'

'draw'

We can also have an expression that contains an operation,

In [25]:
'draw'[::-1]

'ward'

Above, `[::-1]` is an operation that reverses a word. We can use this operation inside a bigger expression,

In [26]:
{w for w in words if w == w[::-1] and len(w) > 4}

{'level', 'madam', 'minim', 'redder', 'refer', 'rever'}

Here we obtain the words that are longer than 4 characters and read the same forwards and backwards.
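
The same comprehension can be tried on a small hypothetical set of words to see how its two conditions work together.

```python
# Hypothetical sample in place of the full set of Shakespeare's words.
sample_words = {'level', 'madam', 'noon', 'draw', 'ward', 'moon', 'refer'}

# Keep a word only if it equals its own reverse AND has more than 4 letters.
palindromes = {w for w in sample_words if w == w[::-1] and len(w) > 4}

print(sorted(palindromes))  # ['level', 'madam', 'refer']
```

Here `'noon'` reads the same in both directions but is dropped by the length condition, and `'draw'` is dropped because its reverse is a different word.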

In [27]:
{w for w in words if w[::-1] in words and len(w) == 4}

{'bard',
 'bats',
 'brag',
 'deed',
 'deem',
 'deer',
 'dial',
 'doom',
 'door',
 'drab',
 'draw',
 'ecce',
 'elle',
 'esse',
 'evil',
 'flow',
 'garb',
 'gnat',
 'gums',
 'hoop',
 'keel',
 'laid',
 'leek',
 'leer',
 'lees',
 'liar',
 'live',
 'loop',
 'maws',
 'meed',
 'meet',
 'mood',
 'moor',
 'nips',
 'noon',
 'part',
 'peep',
 'pins',
 'pooh',
 'pool',
 'poop',
 'port',
 'pots',
 'rail',
 'rats',
 'reed',
 'reel',
 'rood',
 'room',
 'seel',
 'sees',
 'smug',
 'snip',
 'spin',
 'spit',
 'spot',
 'stab',
 'star',
 'stop',
 'swam',
 'tang',
 'teem',
 'tips',
 'tops',
 'trap',
 'trop',
 'trow',
 'ward',
 'wolf',
 'wort'}

Above, we display all words that are 4 characters long and whose reverse is also contained in `words`. We can check whether such words exist at longer lengths,

In [28]:
{w for w in words if w[::-1] in words and len(w) == 5}

{'asses',
 'deeps',
 'devil',
 'keels',
 'knits',
 'leets',
 'leper',
 'level',
 'lived',
 'madam',
 'minim',
 'refer',
 'repel',
 'rever',
 'sessa',
 'sleek',
 'speed',
 'spots',
 'steel',
 'stink',
 'stops'}

In [29]:
{w for w in words if w[::-1] in words and len(w) == 6}

{'diaper', 'drawer', 'redder', 'repaid', 'reward'}

In [30]:
{w for w in words if w[::-1] in words and len(w) > 6}

set()

It seems that within Shakespeare's plays, there isn't any word longer than 6 characters whose reverse is also contained in the plays.
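
Since the last few cells repeat one comprehension with a different length each time, the pattern can be wrapped in a function. The sketch below is one possible way to do so; `reversible` and the sample set are hypothetical, and the results describe only the sample, not Shakespeare's plays.

```python
def reversible(words, length):
    """Return the words of the given length whose reverse is also in `words`."""
    return {w for w in words if len(w) == length and w[::-1] in words}

# Hypothetical sample in place of the full set of Shakespeare's words.
sample_words = {'draw', 'ward', 'noon', 'moon', 'diaper', 'repaid',
                'reward', 'drawer', 'stressed'}

print(sorted(reversible(sample_words, 4)))  # ['draw', 'noon', 'ward']
print(sorted(reversible(sample_words, 6)))  # ['diaper', 'drawer', 'repaid', 'reward']
print(reversible(sample_words, 8))          # set()
```

Palindromes such as `'noon'` are included because a palindrome is its own reverse, while `'stressed'` is excluded because its reverse is not in the sample.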