# 0: Introduction
Welcome to this course on Programming and Problem-solving! It exists mainly because my own A Level Computer Science textbook turned out to be rather boring. I'm not sure I can solve that problem for most people since I'm a boring guy, but I'll give it a shot.

Presumably, we learn software programming to ~~learn about problem-solving, develop logical reason skills, and develop a professionally useful skill-set~~ get a computer to do whatever the heck we want it to. If a computer science course can't achieve that, I'm out; and so should you be if it doesn't meet *your* goals (that applies to this Notebook too, but in that case, please tell me how I can improve it).

Having said that, there are genuinely interesting things to do here: mysteries to solve, and opportunities to use your creativity and express your own style and preferences. I hope that you can walk away from this (and your books and teachers and classes) with some curiosity about and, hopefully, a little love and appreciation for this new set of tools.

## First, some housekeeping!

#### Reading someone else's code is the worst thing in the world
This is a running joke among programmers, but it has a grain of truth. When you write something, you have a certain viewpoint on it that makes it all obvious to you, but not to a stranger. To some extent, having standards and conventions alleviates this problem. However, I still find it weirdly more difficult to read a piece of code than to write one; the same is true of mathematical proofs and the like.

To help you there, I'll do my best to group code into small chunks and provide high-level comments that summarise stuff, but I'll need some intellectual investment from your side too (not a lot, though).

#### I'm probably throwing more at you than you need to know
I'm not going to shy away from diving into the details of topics as they come along. Some of these details can be really useful in the process of writing code, especially if you decide to do something of your own outside the exam structure. Many of them are just nuances rather than additional topics, which can be invaluable when tracking down a bug (mistake/fault). But you might never be asked to discuss (about 30% of?) these ideas directly, so feel free to forget the specifics later down the line.

#### Attitude and other "getting started" advice
As it turns out, the folks who manage the [r/learnprogramming](https://www.reddit.com/r/learnprogramming/) subreddit have done a better job than I can do. So take a look at their [FAQ \(Getting Started\)](https://www.reddit.com/r/learnprogramming/wiki/faq#wiki_getting_started). If you're feeling up for it, go ahead and read the next few questions too.

#### Jupyter
The rest of whatever follows in this Notebook is actual content. Oh, did I mention [Jupyter Notebooks](https://jupyter.org/) before? They're the environment we're in right now, and they let me write stuff in English alongside stuff in programming languages. For the most part you don't have to worry about that, and I'll direct you verbosely through anything you need to do. But if you are interested in playing around more with them, I've made another Notebook about Notebooks, which will also direct you to their official resources.

&nbsp;

Onto the good stuff!

## Logic?
A lot of people think that computer science, science, and math are subjects where you need to use a lot of logic.

Upon [looking that up on Google](https://www.google.com/search?q=what+is+logic), I found the following:
> 1. reasoning conducted or assessed according to strict principles of validity.
> 2. a system or set of principles underlying the arrangements of elements in a computer or electronic device so as to perform a specified task.

As far as I can tell, the first definition is good for [logic puzzles](https://youtube.com/playlist?list=PLJicmE8fK0EiFRt1Hm5a_7SJFaikIFW30) and detective-work. We'll be doing a fair bit of that.

The second one though...really gets me excited. It really captures the essence of programming (introductory programming anyway). What we're going to do is essentially learn about some of these "set\[s\] of principles", so that you can "perform a specified task."

## Yes, logic
What about general people, though? What do we usually mean by saying "something is logical"? Well, one thing we might agree on is that stuff follows from other, well-established, stuff.

Think about the following statements:

1. You should eat healthy meals.
2. Balanced meals are healthy.

So, **logically:**

3. You should eat balanced meals.

It's an obvious conclusion, right? Statement 3 logically follows from statements 1 and 2.

## Getting a computer to do logical work
Let's see if our computer agrees with our logic. I'll tell it these two statements and then ask it for a conclusion.

The cell below is called a "code" cell. Whatever you (or I) type in there will be "told" to the computer. The way you do that is you select the cell (just click on it once), and press **Ctrl** + **Enter** (or **Cmd** + **Return**).

So go ahead and run it! See whether our computer agrees. If it does, this could be your first program! (I mean, I wrote it...but you are still interacting with it!!)

In [None]:
1. I should eat healthy meals.
2. Balanced meals are healthy.

So, should I eat balanced meals?

## :(
That was a weird output! We didn't get an answer. Instead, the computer said something like this:

> ```py
> Input In [1]
>     1. I should eat healthy meals.
>        ^
> SyntaxError: invalid syntax
> ```

What's syntax? [According to Google](https://www.google.com/search?q=what+is+syntax),
> 1. the arrangement of words and phrases to create well-formed sentences in a language.
> 2. the structure of statements in a computer language.

So it's ironic that the computer forgot to put a space in between `SyntaxError` and then complained about syntax and structure.

&nbsp;

&nbsp;

Or *did it* really forget?

## Low-level languages
Well, it turns out, you and I understand English (and/or French, Hindi, Russian, whatever else you know). A computer doesn't.

What it understands probably looks a little like this:
```hex
55
48 89 E5
B8 00 00 00 00
5D
C3
66 2E 0F 1F 84 00 00 00 00 00
0F 1F 44 00 00
```

Okay, that's wayyy too cryptic for us! But this is how people coded back in the day. This is [machine code](https://simple.wikipedia.org/wiki/Machine_code), and it's expressed here in a number system called [hexadecimal](https://simple.wikipedia.org/wiki/Hexadecimal) (the Base16 thing, with 0–9 and A–F). This stuff can get really really complex. Not only do you have to know what a particular hexadecimal value means, but you also need a thorough understanding of computer memory and architecture. Today, only some very experienced engineers ever work with machine code directly.

As computers improved over the years though, we realised we could take some of the stress off humans.

So we made this:
```x86
main:
        push    rbp
        mov     rbp, rsp
        mov     eax, 0
        pop     rbp
        ret
```

It's still cryptic. But it's arguably a better cryptic. In general, this is called [assembly language](https://simple.wikipedia.org/wiki/Assembly_language). When the CPU on a computer does something, it follows these instructions. In particular, this is in a "flavour" called `x86_64`. This means that it'll run on a 64-bit (x64) Intel or AMD CPU; the older 32-bit family is called x86. The reasons 32-bit corresponds to the number 86 but 64-bit is 64 itself are [pure history and convention](https://stackoverflow.com/a/29974471/10907280), so deal with it. This is still pretty darn complex. Assembly language is sometimes used in places with very tight constraints (on things like memory and CPU time).

I claimed that it's a better cryptic. Why? Well, we see some English-ish words, like `main`, `push`, `pop` etc. So, we might be able to guess what this block does. Do you want to take a shot at it?

*\[For CAIE exams, you don't need to know about different kinds of assembly, like x86_64 or ARM, but you do need a general understanding of assembly and some commands, as given by the syllabus document].*

## High-level languages
Once again, as computers got better, we realised that we could offload more of the **translation** work to computers. Starting in 1979, Bjarne Stroustrup created the language [C++](https://www.youtube.com/watch?v=MNeX4EGtR5Y). Our program from the previous cell would look a little like this in C++:

```c++
int main() {
    2 == 3;
}
```

You can probably tell now that it's doing *something* with the numbers 2 and 3, and that's inside *something* called `main`. For now, don't worry about what `int` means, the weird assortment of brackets, or the fact that I've written an equals sign twice (`==`). I *will* tell you that this is doing a **logical operation**, but more on that later.

Machine code and assembly are classified as [low-level languages](https://www.bbc.co.uk/bitesize/guides/z4cck2p/revision/2), while languages like C++ are [high-level languages](https://www.bbc.co.uk/bitesize/guides/z4cck2p/revision/1). In general, the "lower level" you go, the closer you are to working with "bare metal" (doing physics without abstractions like languages), and the "higher level" you go, the closer you are to human-readable languages.

Today, many popular "higher" level languages exist, like Java, JavaScript, C# etc. We'll use one called [Python](https://www.youtube.com/watch?v=x7X9w_GIm1s). Its rather interesting name comes from a comedy show. [From the language's official website](https://docs.python.org/3/faq/general.html#why-is-it-called-python):

> When he began implementing Python, Guido van Rossum was also reading the published scripts from “Monty Python’s Flying Circus”, a BBC comedy series from the 1970s. Van Rossum thought he needed a name that was short, unique, and slightly mysterious, so he decided to call the language Python.

In Python, our program from above would look like this:

```python
2 == 3
```

Neat, isn't it? No funky brackets or semicolons, and no weird words like `int` and `main`.

In fact, remember our previous "program" that gave us a `SyntaxError` without a space? Well, over there, the computer was expecting a program written in Python, not English. Our one-line program with 2 and 3 is a valid Python program. So I'll make it a Python cell below, and you can execute it! (remember **Ctrl** + **Enter** or **Cmd** + **Return**?)

In [None]:
2 == 3

## Analysing the output
If all worked okay, you should have seen a new line appear with a single word: `False`

Congratulations then, on running your first program! By the way, you can edit that code and try something else on your own too. To do so, click the cell and press **Enter** (or **Return**) once. Once you're done with your edits, run it the way you ran it last time.

Can you finally figure out what the program does? It is a **comparison** between 2 and 3. We're essentially asking the question "is 2 equal to 3?" and the equality operator `==` represents this comparison. 2 is obviously not equal to 3, and it tells us as much by saying `False`. As for why there are two equals signs...we'll discuss that a little later.

This is an example of a **logical operator**. Just so that you can convince yourself that this is indeed checking for equality, go ahead and replace the `3` above with a `2`, and then run it again (or just run the one below, which uses a different pair of identical values).

In [None]:
1023 == 1023

## When would this not work?
What do you think would have happened if you ran the C++ equivalent instead? Well, I can't do that here for you since this notebook supports only one language at a time (and we're going to use Python). I can, of course, show you code in other languages, but can't run it here. I'll upload all the programs from this notebook into a separate folder with instructions on how to run them.

But for the following code, I can just tell you the output.

```c++
int main() {
    2 == 3;
}
```

You'd see nothing. No `False`, no `SyntaxError`—nothing. Why is that, when we saw an output for Python?

## Ways to translate a language
Part of it has to do with how Python and C++ are "translated" into machine code.

Code written in Python is usually **interpreted**. This means that the computer looks at a line of code, converts it to machine code, and immediately runs it. It looks at the next line after that. In essence, that allows Python to be run in a [Read–eval–print loop (REPL)](https://www.digitalocean.com/community/tutorials/what-is-repl) environment. What does that mean? You can give the computer one instruction at a time (read), let it run (evaluate), see its output (print), and then give the next one (loop). In other words, it's an "interactive" environment.

Code written in C++ is usually **compiled**. In this case, the computer would convert the entirety of the program into machine code at once, and give it to you. Such a compiled program is called an **executable** (denoted by `.exe` on Windows). There are usually no REPL environments for such languages.

The "cells" in this Notebook behave like mini, slightly weird, REPL environments. So when we asked it to execute the statement `2 == 3`, the computer ran it and came up with the value `False`. We describe this using the following terminology: `2 == 3` **returned** `False`. When a piece of code does something, and hands you a value, that action is called returning. We didn't tell the computer what to do with this returned `False` value (such as remembering it for the future), so it went with its default behaviour and just handed it back to us on the screen.

Okay, truth be told, there's more to returning than passing values around, and the above description is very incomplete, but it's a good rule of thumb for getting code to actually do something, and I promise to resolve this lie before long.
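
If you'd like to see the read, evaluate, print, loop idea in slow motion, here's a rough sketch of a toy REPL written in Python itself. Treat it as an illustration only: the list `typed_lines` stands in for what a person would type, `results` is a name I made up to keep the answers around, and the whole thing leans on Python's built-in `eval`, which evaluates one expression given as text. A real REPL (and this Notebook) does a lot more than this.

```python
# A toy read-eval-print loop (a sketch, not how Jupyter really works).
# `typed_lines` and `results` are made-up names for this illustration.
typed_lines = ["2 == 3", "1023 == 1023", "1 + 1"]

results = []
for source in typed_lines:        # loop: move on to the next line
    value = eval(source)          # read the text and evaluate it
    print(source, "->", value)    # print the value that was returned
    results.append(value)         # keep it around so we can inspect it later
```

Running it shows each "typed" line next to the value it returned, one line at a time.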

You can [read more about different types of translators here](https://www.bbc.co.uk/bitesize/guides/z4cck2p/revision/3).

### SIDENOTE: Justification for discussing both Python and C++
So far, I've mostly been talking about code in two high-level languages. You may wonder why, given that your syllabus requires only one of them (likely Python).

I actually have several reasons:
* **C++ is lower level than Python.** There is no magic in C++ (or its predecessor C). If you want something done, you'll have to do it yourself. I'm borrowing this idea from [Harvard University's CS50x course](https://cs50.harvard.edu/x/), which is taught primarily in C. I think you might have a better handle on programming in general (which *will* help with exams) if you see the gory details of C++. Having said that, you don't *need* to remember it at all—just make sure you get the concept, then you can stick to Python.
* **Programming is supposed to be a transferable skill.** The syllabus expects you to understand and appreciate different "paradigms" (in paper 4). I think this idea will be refined if you are exposed to different languages—it would allow you to focus on the bigger picture and appreciate the general takeaways. Besides, [recent research suggests that we learn the best when exposed to various perspectives](https://www.youtube.com/watch?v=rhgwIhB58PA).
* **Pseudocode appears to follow syntax closer to C++.** On one hand, yes, CAIE Pseudocode is more like Python in that it does away with the verbose brackets and semicolons. But many of its concepts (for instance: types of loops, file handling, variable declaration etc) seem to resemble C++ more closely, as you'll discover later.

So, please forgive me for adding another language for you to deal with.

# 1: Introducing the print statement
Now we're done with the majority of things I would consider background or context, and even looked at some ideas that are required by the syllabus. Most of whatever follows will be "concrete" content, and you ~~can~~ should immediately try to implement it on your own.

## So how do we tell a computer to "output" something?
Many tutorials start by addressing this topic, and the [convention](https://stackoverflow.com/questions/602237/where-does-hello-world-come-from) is to output the message "Hello, World!" or some variation thereof. But we already have our comparison of 2 and 3, so let's try and output that.

If we attempt to output the result of `2 == 3`, our programs change like so:

#### Python

```python
print(2 == 3)
```

#### C++
```c++
#include <iostream>

int main() {
    printf("%i", 2 == 3);
}
```

&nbsp;

You can run the Python program below and see its output. For the C++ program, I'll just tell you what the output was:

```
0
```

In [None]:
print(2 == 3)

## Differences
Okay, so that changed our code drastically—especially in C++. In particular:
* There is now a new line `#include <iostream>`.
* A word called `printf` has entered the chat.
* `"%i"` is now a thing, whatever that means.
* There is some stuff in parentheses `()` and separated by a comma.

In Python, we observe only two changes:
* The `2 == 3` is now inside parentheses.
* It is preceded by a word `print`.

## Similarities
* The word `print` appears in both cases, though it has an `f` in C++.
* `2 == 3` goes inside parentheses `()` in both cases.

## So, what are we printing out?
You didn't see anything on your printer upon running this, right? Turns out, it's called "print" (instead of something like "output") for historical reasons. Before we could use monitors and screens, computers would literally print on a piece of paper.

The term `print` is something called a **function.** A function is like a shortcut to a (usually longer) piece of code that does something for you. The people that made Python and C++ wrote some code that would show something on your screen, and called it `print` and `printf` respectively.

So our `2 == 3` statement **returns** the `False` value, and it is **passed** into the `print` or `printf` functions. The function then takes that and outputs it to the screen. We'll discuss functions in more depth later, when we're writing long enough code to warrant use of functions.
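
Here's a small sketch of that hand-off in Python, split into two visible steps. The name `result` is my own choice and not anything special to Python; we haven't formally met names like this yet, so for now think of it as a sticky note holding the returned value.

```python
# Step 1: the comparison runs first and returns a Boolean value.
# `result` is a made-up name that holds on to that value.
result = (2 == 3)

# Step 2: the value is passed into the print function, which shows it.
print(result)

# Doing both steps in one go produces the same output.
print(2 == 3)
```

Both `print` lines output `False`, which suggests that `print` only ever sees the value that the comparison returned, not the comparison itself.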

Addressing the extra `f` in C++: the `f` in `printf` is for "formatting", and it is the answer to why we have the weird `"%i"` in our program. But to understand it, we'll need to examine a different concept first.

## Differing outputs
Did you notice that C++ and Python printed out different things (`0` and `False` respectively)?

To understand that, let's also take a look at some slightly different code, where we're checking identical values. I'll show you the C++ code and its corresponding output here, and you can run the Python equivalent below.

#### C++ code

```c++
#include <iostream>

int main() {
    printf("%i", 19 == 19);
}
```

#### C++ output
```
1
```

In [None]:
print(19 == 19)

## What if I want to print a message?
Remember I told you about "Hello, World!" earlier? What if I wanted to print a message like that instead? Here's an attempt at that in C++, using whatever we know so far.

#### C++ code attempt

```c++
#include <iostream>

int main() {
    printf("%i", Hello, World!);
}
```

Huh! I could not compile that at all! It gave me an error:
```
exp.cpp: In function ‘int main()’:
exp.cpp:4:18: error: ‘Hello’ was not declared in this scope; did you mean ‘ftello’?
    4 |     printf("%i", Hello, World!);
      |                  ^~~~~
      |                  ftello
exp.cpp:4:25: error: ‘World’ was not declared in this scope
    4 |     printf("%i", Hello, World!);
      |                         ^~~~~
```

Our program seems reasonable. So, just to make sure it's not a weird quirk of C++, let's try it in Python too.

In [None]:
print(Hello, World!)

# 2: Data types

## So, what went wrong?
Well here's a summary of what happened.

| Input to `print` or `printf` | C++ output | Python output |
| -- | -- | -- |
| `2 == 3` | `0` | `False` |
| `19 == 19` | `1` | `True` |
| `Hello, World!` | *[error]* | *[error]* |

There are two noteworthy things here.
1. The difference (C++ gives us a number like `1`, and Python gives us a word like `True`).
2. Our program was unable to say "Hello" to the world :(

Both of these can be explained using the concept of **data types**. It is already obvious to you that computers can remember data. It typically ranges from documents and text, to images and video, to games and apps.

Well, in programming too, it turns out there is a special way of doing this same task. You can, of course, store stuff to **files** and retrieve it from there (think of this as typically happening on the "long-term" storage device, like hard drives), but you can *also* put stuff onto temporary memory (think of this as the RAM).

Just like you can control the type of a file (a filename ending with `.jpg` will conventionally be an image, a filename ending with `.docx` will typically be a Word document et cetera), so can you control the type of this temporary data.

## Important data types
There are 8 important **primitive** (i.e., they're built into the language) data types, and I'll list them below. I've also included what I'm calling a "symbol" for now, but we'll see its utility later.

| Name of type | Python symbol | C++ symbol | Description |
| :-- | :-- | :-- | :-- |
| `INTEGER` | `int` | `int` | An integer (it cannot have precision after the decimal point). |
| `REAL` | `float` | `double` or `float` | A real number (it can have precision after the decimal point). |
| `CHAR` | *[N/A]* | `char` | A single character. |
| `STRING` | `str` | array of `char` | A number of characters. |
| `BOOLEAN` | `bool` | `bool` | A Boolean value, storing only `TRUE` or `FALSE`. |
| `DATE` | `datetime.date` | *[N/A]* | A date, typically stored along with a time. |
| `ARRAY` | `list` | array (e.g. `int[]`) | A number of values of any other data type. |
| `FILE` | file object | `FILE` | An "object" referring to a file. |

Okay, to be honest with you:
* Not all of these are primitives (built in) in Python and/or C++.
* Some of them really make sense only after we examine an A Level concept called Object Oriented Programming.

You can assume that they are part of valid pseudocode, though. And there are ways to implement all of them in both Python and C++.
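
If you're curious what Python itself calls these types, here's a minimal sketch that asks it. The sample values and the names `samples` and `type_names` are made up for illustration; `type(...)` is a built-in function that reports the data type of a value, and `datetime` is a module that ships with Python. I've left out `CHAR` and `FILE`, since the first has no direct Python symbol and the second needs an actual file to point at.

```python
import datetime

# Made-up sample values, one for each type that has a direct Python symbol.
samples = [42, 3.14, "Hello", True, datetime.date(2024, 1, 31), [1, 2, 3]]

# Ask Python for the type of each value and collect the type names.
type_names = []
for value in samples:
    name = type(value).__name__
    type_names.append(name)
    print(value, "is a", name)
```

The names it prints (`int`, `float`, `str`, `bool`, `date`, `list`) line up with the "Python symbol" column of the table.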

## Umm, where were we again?
Well, I had pointed out to you that Python and C++ give different outputs (`False` vs `0` for instance).

It turns out that our comparison `2 == 3` returned a **Boolean value**, i.e., it can only ever be true or false. The `print` function of Python knows how to read the value and convert it into a string. For example, it reads the value `False`, and understands that it must be printed as the letters 'F', 'a', 'l', 's', and 'e'.

In the case of `printf` in C++, it does not automatically know this. We have to tell it manually what data type we want to output (no magic, remember?), with a **format code**. In this case, the format code was `%i`.

Confused? Look at the diagrams below.

![How C++ printf() works](assets/printf.png)

![How Python print() works](assets/print.png)

However, `printf` does not happen to have a format code for Boolean values. So, I told the function to convert this to an integer instead when I specified `"%i"`.

If you know about binary, you might know that the number 0<sub>10</sub> is written as 0<sub>2</sub>, 1<sub>10</sub> is written as 1<sub>2</sub>, 2<sub>10</sub> is written as 10<sub>2</sub>, and so on. A binary 1 can be represented as a `True` and a binary 0 can be represented as a `False`.

So, when our comparison returned `False`, it got converted to an integer `0`. Likewise, `True` got converted to an integer `1`.
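
Python will do the same conversion if we ask for it explicitly. This is a small sketch using the built-in `int(...)` function, which converts a value to an INTEGER; the two names on the left are made up for illustration.

```python
# int(...) converts a value to an INTEGER.
false_as_number = int(2 == 3)     # the comparison returns False
true_as_number = int(19 == 19)    # the comparison returns True

print(false_as_number)
print(true_as_number)
```

The two printed numbers match what the C++ programs showed for the same comparisons.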

### C++ format codes
A C++ format code has the format `%x`, where `x` is some character. There are many format codes available, and Microsoft has a [document with many of them explained](https://docs.microsoft.com/cpp/c-runtime-library/format-specification-syntax-printf-and-wprintf-functions#type-field-characters) if you're interested in learning more about them.

However, know that this is here for your information only, and is not examined in the syllabus.
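
If you'd like to play with format codes without a C++ compiler, Python happens to support a similar notation through the `%` operator on text. The sketch below is only that, a sketch: Python's codes are close to the C++ ones but not identical, and the names on the left are made up.

```python
# "%i" asks for the value to be shown as an INTEGER.
false_as_integer = "%i" % (2 == 3)
true_as_integer = "%i" % (19 == 19)

# "%.2f" asks for a REAL shown with 2 digits after the decimal point.
pi_two_places = "%.2f" % 3.14159

print(false_as_integer)
print(true_as_integer)
print(pi_two_places)
```

The first two lines reproduce the `0` and `1` we got from `printf`, this time from Python.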

## The trouble with "Hello, World!"
Okay, so why did this not work? It turns out that computers need a way to distinguish **literals** from **tokens**.

According to [Wikipedia](https://en.wikipedia.org/wiki/Literal_(computer_programming)):
> In computer science, a literal is a notation for representing a fixed value in source code.

A token can be thought of as an element of the programming language. It could be a function like `print`, or a symbol like `==`. In contrast, a literal is something the programmer put in there, like a string or a number.

Numbers are easy to distinguish, since a token whose name is a number is usually not allowed. However, a special character is needed to distinguish a STRING or a CHARACTER. For example, you could have a string literal with the characters 'p', 'r', 'i', 'n', and 't' (the word "print").

We do this using quotation marks. In C++, a single CHARACTER is enclosed in single quotes (like `'a'`) and a STRING is enclosed in double quotes (like `"Hello, World!"`). In Python, there is no difference between the two and both use the datatype `str`. In fact, Python does not have a datatype that is equivalent to `char` in C++, and everything is a STRING (though you could always just put one character into a Python STRING).
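
To check the claim about Python, here's a minimal sketch that compares the two kinds of quotes. The names `same_text`, `same_type`, and `single_character_type` are made up for illustration; `type(...)` is a built-in function that reports a value's data type.

```python
# Single and double quotes around the same characters give the same STRING.
same_text = ('Hello, World!' == "Hello, World!")

# A single character and a longer piece of text share one data type.
same_type = (type('a') == type("Hello, World!"))
single_character_type = type('a').__name__

print(same_text)
print(same_type)
print(single_character_type)
```

It prints `True`, `True`, and `str`, so the choice between single and double quotes in Python comes down to taste.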

## Trying to print a string
So, with our newfound knowledge of quotes and strings, let's try to print strings. As usual, I'll give you the C++ code and output, and leave you to run the Python code below.

#### C++ code
```c++
#include <iostream>

int main() {
    printf("%s", "Hello, World!");
}
```

#### Output
```
Hello, World!
```

In [None]:
print("Hello, World!")

## Removing redundancies
Most programming tutorials would have you up and running with "Hello, World!" in a matter of seconds. I dragged you all the way here before accomplishing that.

However, do you notice anything odd about our print statement in C++?

Well, it appears that the format code (`%i` or `%s`) is also enclosed in double quotes. So is that also a STRING? Well, yes. `printf()` can print a STRING directly, and you only need to provide a format code for other values. So if we shorten our program (as below), it would still run the same.

```c++
#include <iostream>

int main() {
    printf("Hello, World!");
}
```

In general, it is a bad idea to have **redundancies** in programming. Since computers are soooo good at doing things repeatedly, it's usually considered a bad sign if *you* are manually doing something repeatedly or going through unnecessary steps. That said, programmers are infamous for using workarounds and hacks all the time!
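
As a quick preview of what "letting the computer do the repeating" can look like, here's a rough Python sketch. It uses a loop, which we haven't covered yet, so don't worry about the details; `times_printed` is a made-up counter that is there only so we can check how often the loop ran.

```python
# Redundant: the same instruction typed out three times by hand.
print("Hello, World!")
print("Hello, World!")
print("Hello, World!")

# Less redundant: write the instruction once and let the computer repeat it.
# `times_printed` is a made-up counter, kept only to check the loop.
times_printed = 0
for _ in range(3):
    print("Hello, World!")
    times_printed = times_printed + 1
```

Both halves print the message three times, but only the second would stay the same length if we wanted three hundred copies instead.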

# 3: Going back to logic
We started our discussion by asking the computer about balanced meals and whether 2 equals 3. Let's now try to do something with that.

## Logic...so what?
I claimed that computers are logical devices, but we didn't really go anywhere from that. Usually, we use logical reasoning to **decide** or conclude something. For example, we could use data about meals to conclude what kind of meals we should eat.

Actually, we do this all the time:
1. You might buy a carton of milk if you've run out of it at home.
1. If one brand of pencils is cheaper than another (but they're equivalent otherwise), I'll buy the cheaper one.
1. Cars go through an intersection while the light is green.
1. A biologist and a physicist might break up if there is no chemistry between them.
1. You should see a doctor if you are unwell.

Hold on! Aside from the one about pencils, most of these aren't really using much logic, right? I may have to read and think a bit about that one, but most of the others are just obvious.

Well, let's look at [a narrower definition of a **logical operation**](https://www.computerhope.com/jargon/l/logioper.htm#:~:text=A%20logical%20operation%20is%20a,or%20more%20phrases%20of%20information.&text=In%20computing%2C%20logical%20operations%20are,operations%20are%20called%20boolean%20operations.):
> A **logical operation** is a special symbol or word that connects two or more phrases of information.

Fundamentally, this is about logic gates. For the most part, when we talk about "logic" in any sort of programming, it can be reduced to an (often very complicated) series of logical operations.

&nbsp;

Specifically, let's try framing the first of these examples as a series of steps.
> You might buy a carton of milk if you've run out of it at home.

1. Go home.
1. Check the amount of milk in stock at home.
1. If the amount is insufficient,
    * Go to a store to buy one carton.
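
The steps above can be sketched in Python as follows. This is only a rough preview under made-up assumptions: the names `cartons_at_home`, `cartons_needed`, and `should_buy` and the numbers in them are invented, and it uses `<` ("is less than") and `if`, neither of which we have properly met yet.

```python
# Made-up numbers: how many cartons are at home, and how many we want.
cartons_at_home = 0
cartons_needed = 1

# The check: a comparison that returns a Boolean value.
should_buy = cartons_at_home < cartons_needed

# The decision: only go shopping if the check returned True.
if should_buy:
    print("Go to a store to buy one carton.")
else:
    print("No shopping needed.")
```

Try changing `cartons_at_home` to `2` and running it again to see the decision flip.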

