# On Programming

In [None]:
import sys
from pathlib import Path

current = Path.cwd()
for parent in [current, *current.parents]:
    if (parent / '_config.yml').exists():
        project_root = parent  # ← Add project root, not chapters
        break
else:
    # Fallback when no _config.yml is found in any parent directory
    project_root = Path.cwd().parent.parent

sys.path.insert(0, str(project_root))

from shared import thinkpython, diagram, jupyturtle

# Register as top-level modules so direct imports work in subsequent cells
sys.modules['thinkpython'] = thinkpython
sys.modules['diagram'] = diagram
sys.modules['jupyturtle'] = jupyturtle

You need to know a little about a lot of things to become an expert. To become an expert in Python, some of these little things are:

<iframe width="560" height="315"
src="https://www.youtube-nocookie.com/embed/-uleG_Vecis"
  frameborder="0"
  allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture"
  allowfullscreen
  referrerpolicy="strict-origin-when-cross-origin"></iframe>

This chapter introduces the fundamental concepts of programming using Python as our teaching language. Understanding these core concepts is essential for any programmer, as they form the foundation upon which all programming skills are built: 

1. **Programming Language**
2. **Levels of Abstraction**
3. **Interpreted vs Compiled**
4. **Programming Constructs**
5. **Expressions and Statements**
6. **Number Systems**
7. **Resources**

This chapter is all about the **constructs** of programming. This vocabulary is important: you will need it to understand computational materials, design solutions to computational problems, and communicate effectively with other programmers. These concepts are also universal across most programming languages, although syntax and specific implementations may differ. Learning these topics will give you a solid foundation for learning other languages and tackling more advanced programming challenges.

## Programming Language

Learning to program means learning a new way of thinking -- thinking like a computer scientist. This approach combines some of the best features of mathematics, engineering, and the natural sciences. Like mathematicians, computer scientists use formal languages to denote ideas -- specifically computations. Like engineers, they design things, assembling components into systems and evaluating trade-offs among alternatives. Like scientists, they observe the behavior of complex systems, form hypotheses, and test predictions.

### Natural vs Formal Languages
**Natural languages** are the languages that people use to communicate, such as English, Spanish, and French. They were not designed by people; they evolved naturally. **Formal languages** are languages that are designed by people for specific applications. For example, the notation that mathematicians use is a formal language that is particularly good at denoting relationships among numbers and symbols. Similarly, programming languages are formal languages designed to express computations. Although formal and natural languages have some features in common, there are important differences:

* Ambiguity: Natural languages are full of ambiguity, which people deal with by using contextual clues and other information. Formal languages are designed to be nearly or completely unambiguous, which means that any program has exactly one meaning, regardless of context.

* Redundancy: In order to make up for ambiguity and reduce misunderstandings, natural languages use redundancy. As a result, they are often verbose. Formal languages are less redundant and more concise.

* Literalness: Natural languages are full of idiom and metaphor. Formal languages mean exactly what they say.

Because we all grow up speaking natural languages, it is sometimes hard to adjust to formal languages. Formal languages are **denser** than natural languages, so it takes longer to read them. Additionally, the **structure** is important, so it is not always best to read from top to bottom or left to right. Finally, the **details** matter. Small errors in spelling and punctuation, which you can get away with in natural languages, can make a big difference in a formal language.

## Levels of Abstraction
Programming languages exist on a spectrum, from machine code, which is closest to the hardware, to high-level languages, which are the most human-readable, as shown below:

| Level | Description | Examples | Code Example |
|-------|-------------|----------|--------------|
| **High-Level Languages** | Closest to human language, abstracted from hardware details | Python, Java, JavaScript, C, C++ | `print("Hello, World!")` |
| **Low-Level Languages** | Closer to machine code, direct hardware manipulation | Assembly language | `mov eax, 1`<br>`msg db 'Hello, World!', 0xA` |
| **Machine Code** | Binary instructions directly executed by CPU | Binary (0s and 1s) | `10110000 00000000`<br>`10110001 00001010` |

## Interpreted vs. Compiled 

Programming languages also differ in how they are executed. Traditionally, **interpreted languages** (like Python, JavaScript, and Ruby) are executed line by line by an interpreter at runtime, which translates and runs the code on the fly. **Compiled languages** (such as C, C++, and Rust), on the other hand, require a compilation step in which the entire source code is translated into machine code before execution, which typically results in faster runtime performance. In practice, many modern language implementations (like those of Python, Java, and C#) use a hybrid approach, compiling code to an intermediate **bytecode** that is then interpreted or just-in-time compiled by a virtual machine, combining the benefits of both models.
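As a small illustration of the bytecode step, the sketch below assumes you are running CPython (the reference implementation of Python). It uses the built-in `compile()` and `eval()` functions and the standard-library `dis` module; the names `source`, `code`, and `result` are illustrative only.

```python
import dis

source = "2 + 3"

# Compile step: the source text becomes a code object that holds bytecode.
code = compile(source, "<example>", "eval")
print(type(code.co_code))   # the raw bytecode is stored as a bytes object

# Execution step: the Python virtual machine runs the bytecode.
result = eval(code)
print(result)

# Human-readable listing of the bytecode instructions.
# The exact instructions differ between Python versions.
dis.dis(code)
```

Normally both steps happen automatically when you run a program, which is why Python feels like a purely interpreted language.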

## Programming Constructs

A computer program is a set of instructions, written in the notation defined by a programming language, given to a computer. Interestingly, there are only a few key concepts we need to know when learning to give instructions to computers, and they apply to most programming languages. The three basic control structures are:

1. **Sequence**: instructions are executed one after another (sequential execution).
2. **Selection**: decision-making; namely, choosing between alternative paths of action within a program.
3. **Iteration**: code repetition; either count-controlled or condition-controlled.

In addition to the three basic programming constructs, programming languages have elements such as:

1. **Subroutine**: a block of code (**function**/**method**) in a modular program that performs a particular task.
2. **Nesting**: selection and iteration constructs can be nested within each other.
3. **Variable**: a named computer memory location that stores values.
4. **Type** (data type): a classification of data that specifies the possible values and the operations allowed on them.
5. **Operator**: a symbol that performs an operation on one or more operands.
6. **Array**: a structure that stores multiple values of the same data type in a single variable, also known as a data **collection**.
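The constructs listed above can be seen together in one short program. This is only a sketch: the function name `describe` and the variable names are made up for illustration, and Python's `list` stands in for the array (unlike a strict array, a Python list may also hold values of mixed types).

```python
def describe(numbers):                 # subroutine (function)
    labels = []                        # variable holding a collection
    for n in numbers:                  # iteration (count-controlled)
        if n % 2 == 0:                 # selection, nested inside the loop
            labels.append("even")
        else:
            labels.append("odd")
    return labels


numbers = [1, 2, 3, 4]                 # a list of values of the same type (int)
labels = describe(numbers)             # sequence: statements run top to bottom

total = 0
while total < 5:                       # iteration (condition-controlled)
    total = total + 2                  # operator + applied to two operands

print(labels)
print(total)
```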

```{index} expression, statement
```
## Expressions and Statements


In programming languages, expressions and statements are fundamental building blocks for formulating and using the language. 

By definition, an **`expression`** is a combination of values, variables, operators, and function calls that the Python interpreter can **evaluate** to produce a **single value**, which may be assigned to a variable for later use. Note that a **single** literal value, like an integer or string, can be an expression. 

An expression may contain **operators** and **operands**, such as `a + b * c`, as shown below.

:::{figure} ../../images/expression.jpg
:alt: expression
:width: 60%
:align: left

Expression, Operand, and Operator
:::


A **`statement`** is a complete instruction that the interpreter **executes** to perform an action or control the flow of the program. Unlike an expression, a statement does not evaluate to a value that can be used elsewhere. For example, an *assignment statement* creates a variable and gives it a value, but the statement itself has no value.

Computing the value of an expression is called **evaluation**; whereas running a statement is called **execution**. So, a statement performs an action. An expression **computes** a **value**. For example:

| Type           | Example       | Description                                                         |
| -------------- | ------------- | ------------------------------------------------------------------- |
| **Statement**  | `x = 5`       | Assignment statement: Assigns 5 to `x` (changes program state). Produces **no value**.    |
|                | `print(x)`    | Expression statement: calls `print()` for its effect (displaying text); the call returns `None`, which is discarded. |
|                | `if x > 0:`   | `if` statement: Begins a conditional block, a control flow structure; no value. |
|                | `import math` | Import statement: Makes the `math` module available; no value. |
| **Expression** | `2 + 3`       | Produces the value `5`.                                             |
|                | `x * y`       | Computes a value based on `x` and `y`.                              |
|                | `len("data")` | Evaluates to `4`.                                                   |


In [2]:
x = 5                # statement: assigns a value; nothing is displayed
x + 5                # expression: evaluates to 10; a notebook displays it only when it is the last line of a cell
if x > 0:            # statement: controls flow; no value, only the side effect below
    print("x is positive")   # a block of code that executes if the condition is true

x is positive
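One way to check the distinction yourself is with the built-in functions `eval()`, which accepts only an expression, and `exec()`, which runs statements. The sketch below uses them purely for demonstration; the variable names are illustrative.

```python
value = eval("2 + 3")        # an expression evaluates to a value
print(value)

try:
    eval("x = 5")            # an assignment is a statement, not an expression
except SyntaxError:
    print("'x = 5' is not an expression")

namespace = {}
exec("x = 5", namespace)     # exec() executes the statement and returns None
print(namespace["x"])

returned = print("hello")    # a function call is an expression; print() returns None
print(returned)
```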


## Number Systems

```{admonition} Advanced Topic
:class: tip
This section covers number systems (binary, octal, hexadecimal), which are more advanced material. While useful for understanding how computers work at a lower level, they are not essential for writing most Python programs. Feel free to skim this section on first reading and return to it later when you need to work with different number bases.
```

<!-- TODO: add base 16 conversion -->

In programming, number systems are ways of representing numbers using different bases. Computers store and process data in binary, but programmers often use other bases for convenience, readability, or hardware interaction. The four main number systems used in programming are binary (base-2), decimal (base-10), hexadecimal (base-16), and octal (base-8):

| System          | Base | Digits Used | Typical Use                                                | Python Example (all = 100) |
| --------------- | ---- | ----------- | ---------------------------------------------------------- | -------------------------- |
| **Binary**      | 2    | 0–1         | Hardware, CPU, memory, bitwise operations                  | `0b1100100` → 100          |
| **Octal**       | 8    | 0–7         | Unix file permissions | `0o144` → 100              |
| **Decimal**     | 10   | 0–9         | Human-friendly math, user input/output                     | `100` → 100                |
| **Hexadecimal** | 16   | 0–9, A–F    | Memory addresses, colors, debugging, networking     | `0x64` → 100               |

As you can see, binary literals start with `0b`, octal literals start with `0o`, and hexadecimal literals start with `0x`.
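As a quick check, the four literals from the table can be compared directly. The names `values` and `all_equal` are illustrative.

```python
# The same integer written in binary, octal, decimal, and hexadecimal.
values = [0b1100100, 0o144, 100, 0x64]

print(values)                        # Python displays every integer in base 10
all_equal = len(set(values)) == 1    # a set keeps only distinct values
print(all_equal)
```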

All number systems use **positional notation**, where each digit's value depends on its **position** and the **base**. For example, the decimal number `345`, or `345 (base 10)`, can be broken down as shown in the table below. Note that each position corresponds to a different power of the base, which determines the value of the digit in that position.


| Digit | Position | Digit × base^position | Expanded | Value |
|-------|----------|-----------------------|----------|-------|
| **3** | 2 | 3 × 10^2 | 3 × 100 | **300** |
| **4** | 1 | 4 × 10^1 | 4 × 10  | **40** |
| **5** | 0 | 5 × 10^0 | 5 × 1   | **5** |
|       |   |          | Sum     | **345** |

Here, you see that the digit 3 in 345 means 300 because it is in the hundreds place (10^2, because the base is 10). Therefore, we see that:

```
digit value = digit x (base ^ position)
```
You then add all the digit values together to get the value of the number:
```
345 = 3×10² + 4×10¹ + 5×10⁰
```
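The formula above can be written as a short function. This is a minimal sketch; the name `positional_value` and its parameters are hypothetical and not part of Python.

```python
def positional_value(digits, base):
    """Add up digit * base**position, where position 0 is the rightmost digit."""
    total = 0
    for position, digit in enumerate(reversed(digits)):
        total += digit * base ** position
    return total


print(positional_value([3, 4, 5], 10))   # 3*100 + 4*10 + 5*1
```

Because the base is a parameter, the same function works for any base, not only base 10.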

Following the same process of adding up the digit values, we can get the decimal value of a binary number such as `1011 (base 2)`:

| Digit | Position | 2 to the *n*th Power | Value |
| -------- | ---------- | ----- | ----- |
| 1     | 3        | 2³         | 8     |
| 0     | 2        | 2²         | 0     |
| 1     | 1        | 2¹         | 2     |
| 1     | 0        | 2⁰         | 1     |
|       |          |            | 11    |


So, we can do **base conversion** from (base 2) to (base 10) by:
```
1011₂ = 1×2^3 + 0×2^2 + 1×2^1 + 1×2^0 = 8 + 0 + 2 + 1 = 11₁₀
```

Or, let us put the place values at the top, which I prefer:

| Position  | 2^3 | 2^2 | 2^1 | 2^0 |    |
|-- | --- | --- | --- | --- | -- | 
| Place value  | 8   |  4  |  2  |  1  |    |
| Digit  | 1   |  0  |  1  |  1  |    |
| Calculation  | 1×8 |  0×4 | 1×2 |  1×1   |    |
| Value  | 8   |  0  |  2  |  1  | 11 |

The base 2 system is commonly known as the basis of computing. To count from 0 to 5 (base 10) in binary:

```text
0 = 0b0000
1 = 0b0001
2 = 0b0010
3 = 0b0011
4 = 0b0100
5 = 0b0101
```
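The same counting table can be generated with the built-in `format()` function. In the format specification `"#06b"`, `b` asks for binary, `#` adds the `0b` prefix, and `06` pads with zeros to a total width of six characters. The name `lines` is illustrative.

```python
lines = []
for n in range(6):
    # Width 6 = two characters for "0b" plus four binary digits.
    lines.append(f"{n} = {format(n, '#06b')}")

print("\n".join(lines))
```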

To graphically see that the number `100 (base 10)` is equal to `1100100 (base 2)` (or **0b**`1100100`, where **b** stands for binary):

```text
0b1100100
  ││││││└ 0 × 2^0 = 0 × 1  = 0
  │││││└─ 0 × 2^1 = 0 × 2  = 0
  ││││└── 1 × 2^2 = 1 × 4  = 4
  │││└─── 0 × 2^3 = 0 × 8  = 0
  ││└──── 0 × 2^4 = 0 × 16 = 0
  │└───── 1 × 2^5 = 1 × 32 = 32
  └────── 1 × 2^6 = 1 × 64 = 64
                             __
                             100
```

Python has built-in functions `bin()`, `oct()`, `hex()`, and `int()` for converting between number systems. `bin()`, `oct()`, and `hex()` return strings prefixed with **0b**, **0o**, and **0x**, respectively. Note that `int()` requires a base when converting such a string back to an integer. Additionally, Python recognizes literals written in these number systems and displays their values in base 10 when evaluated.

In [None]:
num_b = bin(100)        # '0b1100100'
num_o = oct(100)        # '0o144'
num_h = hex(100)        # '0x64', converted from 100
num_h2 = hex(0b1100100) # '0x64', converted from base 2
num_i_h = int(num_h, 16) # 100 (an int, not a string)
num_i_b = int(num_b, 2)  # 100 (an int, not a string)

print(num_b, num_o, num_h, num_h2, num_i_h, num_i_b, sep="\n")

0b1100100
0o144
0x64
0x64
100
100


In [None]:
### Exercise 
# Q1. What's the value of 10 (base 10) in binary? (Print it as a string if you use it as a literal)
# Q2. What's the value of decimal 64 in base 16?
# Try to produce the same output as the cell below. You may need to use the print() function.
### Your code starts here



### Your code stops here

In [None]:
print(bin(10))  ### or print("0b1010") 
print(hex(64))

0b1010
0x40


### Character Encoding

For computers, the smallest unit of data is a **bit** (binary digit). A bit can only be 0 or 1, which can represent off/on, false/true, or no voltage/voltage. A **byte**, on the other hand, is a group of 8 bits, which can represent 2^8 = 256 different values (0-255), and is the fundamental addressable unit in modern computing.
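The range of a byte can be checked with Python's built-in `bytes` type and the `int.bit_length()` method. The variable names below are illustrative.

```python
values_per_byte = 2 ** 8
print(values_per_byte)                 # number of distinct values in 8 bits

largest = bytes([255])                 # 255 is the largest value one byte can hold
print(len(largest))                    # length in bytes

try:
    bytes([256])                       # one more than the maximum
except ValueError:
    print("256 does not fit in one byte")

bits_255 = (255).bit_length()          # bits needed to write 255 in binary
bits_256 = (256).bit_length()          # bits needed to write 256 in binary
print(bits_255, bits_256)
```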

Computers only process machine code. For humans to talk to computers, we need something in between that is understood by both, and that is encoding. For example, the letter **A** is represented as 65 (base 10) or 0b1000001 in the ASCII (American Standard Code for Information Interchange) code table. ASCII covers English letters, digits, punctuation, and control characters. An early version of the ASCII table, from MIL-STD-188-100, is shown below:

```{figure} ../../images/ascii-code-chart.png
:name: ascii-code-chart
:alt: ASCII Code Chart 1972
:width: 60%
:align: center

[ASCII Code](https://en.wikipedia.org/wiki/ASCII) Chart 1972
```

In this chart, you can see that the letter **A** has the binary bits `100 0001`. When comparing string/character literals, we say that 'B' is greater than 'A' because of the encoding (the ASCII value of 'B' is 66, which is greater than the ASCII value of 'A', 65).
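Python exposes these codes through the built-in functions `ord()` (character to code) and `chr()` (code to character). The variable names in this short sketch are illustrative.

```python
code_a = ord("A")              # character -> integer code
code_b = ord("B")
print(code_a, code_b)

print(chr(65))                 # integer code -> character
print(bin(code_a))             # the same code written in binary

b_is_greater = "B" > "A"       # strings are compared by their character codes
print(b_is_greater)
```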

Since the ASCII code only represents English characters, the [Unicode Standard](https://en.wikipedia.org/wiki/Unicode) and the standard Unicode Transformation Format (UTF) schemes were proposed to support the use of text in all of the world's writing systems that can be digitized. Among them, [**UTF-8**](https://en.wikipedia.org/wiki/UTF-8) is the dominant encoding on the internet and is supported by all modern operating systems and programming languages.

ASCII defines 128 standard characters, each stored in 1 byte (7 bits originally and 8 bits for extended ASCII). UTF-8 is variable-length, using one to four 1-byte code units (8 to 32 bits) per character to support 1,112,064 code points, and it encodes the standard ASCII characters in just 1 byte for backward compatibility. With this many code points, UTF-8 can represent emojis and East Asian characters.
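The variable length of UTF-8 can be observed with the `str.encode()` method. The four sample characters below are arbitrary choices for illustration, written as escape sequences so that the code points are unambiguous.

```python
samples = ["A", "\u00e9", "\u4e2d", "\U0001F600"]   # A, é, 中, 😀

lengths = []
for ch in samples:
    encoded = ch.encode("utf-8")       # str -> bytes
    lengths.append(len(encoded))
    print(ch, len(encoded), encoded)

# An ASCII character has the same single byte in both encodings.
same_as_ascii = "A".encode("utf-8") == "A".encode("ascii")
print(same_as_ascii)
```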

In [None]:
### Exercise 
### What's the decimal code for the letter "C" in the ASCII code? Save it to a variable named c_dec.
### What's the binary code for the letter "C" in the ASCII code? Save it to a variable named c_bin using the bin() function.
### Try to produce the same output as the cell below. Use the print() function and escape sequences.
### Your code starts here




### Your code stops here

67
0b1000011


In [None]:
c_dec = 67
c_bin = bin(67)
print(f"The decimal code for the letter \"C\" is {c_dec}.")
print(f"The binary code for the letter \"C\" is {c_bin}.")

The decimal code for the letter "C" is 67.
The binary code for the letter "C" is 0b1000011.


## Resources

**Official Python Documentation**

1. [PEP 20 - The Zen of Python](https://pep.python.org/pep-0020/)
2. [The Python Standard Library](https://docs.python.org/3/library/index.html)
3. [The Python Language Reference](https://docs.python.org/3/reference/index.html)
4. [Python Tutorial (Official)](https://docs.python.org/3/tutorial/index.html)
5. [Python Package Index (PyPI)](https://pypi.org/) - Repository of third-party Python packages

**Style Guides and Best Practices**

1. [PEP 8 - Style Guide for Python Code](https://pep.python.org/pep-0008/)
2. [Google Python Style Guide](https://google.github.io/styleguide/pyguide.html) - Industry-standard style guide used at Google; differs from PEP 8 in places, such as forbidding wildcard imports (`from X import *`) and requiring more structured docstrings for functions/methods.

**Tutorials**

- [Real Python](https://realpython.com/) - Tutorials and articles on Python programming
- [Python for Everybody](https://www.py4e.com/) - Free interactive textbook and course
- [Automate the Boring Stuff with Python](https://automatetheboringstuff.com/) - Practical programming for beginners

**Interactive Practice**

- [Python Tutor](https://pythontutor.com/) - Visualize code execution step by step
- [LeetCode](https://leetcode.com/) - Coding practice and interview preparation
- [HackerRank Python](https://www.hackerrank.com/domains/python) - Practice problems and challenges