<div style="text-align:center;">
    <img src="http://www.cs.wm.edu/~rml/images/wm_horizontal_single_line_full_color.png">
    <h1>CSCI 312-01, Fall 2025</h1>
    <h1>Effective C, Chapter 1</h1>
    <h1>Getting started with C</h1>
</div>

# Contents

* [Hello, world!](#Hello,-world!)
* [Compilation and linkage](#Compilation-and-linkage)
    * [Compilation and linkage in one step](#Compilation-and-linkage-in-one-step)
    * [Compilation and linkage in multiple steps](#Compilation-and-linkage-in-multiple-steps)
* [Statements and comments](#Statements-and-comments)
* [The C standard](#The-C-standard)
    * [Implementation-defined behavior](#Implementation-defined-behavior)
    * [Unspecified behavior](#Unspecified-behavior)
    * [Undefined behavior](#Undefined-behavior)
    * [Locale-specific behavior and common extensions](#Locale-specific-behavior-and-common-extensions)

# Python vs. C vs. C++ vs. Java

|           | Python          | C            | C++   | Java  |
| :--------:| :-------------: | :----------: | :---: | :---: |
| printing to standard output | `print()`   | `printf()` | `std::cout <<`<br/>`std::println()` | `System.out.println()`<br/> `System.out.printf()` |
| string literals | `'boo!'` or `"boo!"` | `"boo!"` | same as C | same as C |

# Hello, world!

We begin with <a href="https://en.wikipedia.org/wiki/%22Hello,_World!%22_program">"Hello, world!"</a>.  This program simply prints "Hello, world!" (without the quotation marks) to the screen and stops.  The actual C code is in a file named `hello.c`; here we use the &#42;nix command `cat` to show the file.

In [2]:
cat -n src/hello.c  # cat is a command to concatenate and print files; 
                     # the -n option gives us line numbers.

     1	// Hello, world!
     2	#include <stdio.h>
     3	int main(int argc, char **argv)
     4	{
     5	  printf("Hello, world!\n");
     6	  return 0;
     7	}


As you can see, this is considerably more complicated than the Python equivalent,
<p style="text-indent: 4em"><code>print("Hello, world!")</code></p>

Let's delve into the code.

**Line 2.**  This is a C preprocessor directive.  (Line 1 is just a comment.)  The file `stdio.h` is called a **header file** (hence the suffix `.h`) since it is included at the head of a source file.  The name `stdio.h` comes from the fact that this file contains information about **standard i/o**.  The code in this file is inserted by the C preprocessor at the location of the `#include` directive.  This particular file contains, among many other things, a **function prototype** for the function `printf()` describing the number and type of inputs to the function and the type of the value returned by `printf()`.  The function `printf()` returns an integer that is the number of bytes successfully printed (14, in this case, counting the newline).

**Line 3.**  The function `main()` is where execution begins.  In Python this role is played by the file for which 
<p style="text-indent: 4em"><code>__name__ == "__main__"</code></p>

**Every C/C++/Java program must have a `main()`.**

The `int` in front of `main()` is a **type declaration**.  Here it says that `main()` is a function that returns a value of type `int` (integer).  There are also two inputs to `main()`, an `int` named `argc` and a variable of type `char**` (we will explain this later) named `argv`.  The argument `argc` is the number of elements in `argv`, while `argv` is an array of strings that contains the command line arguments; Python's [`sys.argv`](https://docs.python.org/3/library/sys.html#sys.argv) is the equivalent (Python borrowed the idea from C).

**Lines 4 and 7.**  In C/C++ the body of a function is enclosed in curly braces `{ }` (informally, squiggly brackets), whereas in Python it is indented.  **Indentation has no syntactic meaning in C/C++.**

**Line 5.**  Here `printf()` is a function similar to Python's `print()` function.  Unlike Python's `print()`, the `printf()` function in C does not append a newline by default, so we must do so ourselves with `\n`.  The common escape sequences (e.g., `\t` for tab) are the same in C/C++/Java as in Python.

Strings in C/C++/Java are delimited by double quotes: `"Hello, world!\n"`.  Single quotes are not an option as they are in Python.

**Line 6.**  Finally, the `return` statement is familiar from Python.  Here the return value is returned to the program's execution environment.

<img src="https://www.cs.wm.edu/~rml/images/danger.svg" style="height: 30px"/>  We will use the K&amp;R convention of writing parentheses after the names of functions, e.g., <code>printf()</code>, to indicate we are talking about a function.

# Compilation and linkage

We will need to build a binary executable from our C source file.  This is another way in which C differs from Python.

There are two steps:
1. **compilation**, which translates our C source file `hello.c` into a binary **object file**, and
2. **linkage**, which takes the object files as well as any library object files that are needed and stitches them together into a binary executable.

Part of the fun of compiled languages is that you can run into problems at both steps.

## Compilation and linkage in one step

First we use the GNU C compiler `gcc` to perform both compilation and linkage in one invocation.  The convention in &#42;nix is that options for commands are specified with flags that begin with `-` ("dash" or "minus") or `--` ("dash dash" or "minus minus").  We are specifying the options
* `-Wall -pedantic`, which means turn on a broad set of commonly useful warnings (`-Wall`; despite the name, this is not every warning) and the diagnostics required by strict ISO C (`-pedantic`), and 
* `-o hello`, which means "create the output file `hello`".

In [4]:
gcc -Wall -pedantic -o hello src/hello.c  

Let's confirm there is a newly created binary named `hello`.

In [6]:
date          # Print current date and time.
ls -ls hello  # List information about the file hello, including last modified date.
file hello    # Check what type of file hello is.

Thu Sep  4 17:56:00 EDT 2025
72 -rwxr-xr-x  1 rml  staff  33432 Sep  4 17:55 hello
hello: Mach-O 64-bit executable arm64


As you can see, the file `hello` was just created.

The `file` command tells us that the executable is a 64-bit ARM (arm64) binary in the Mach-O format, the executable format used by macOS.  The name comes from Mach, the kernel at the core of macOS.

If you omit the `-o` option the executable will be named `a.out` by default:

In [8]:
gcc -Wall -pedantic src/hello.c 

Let's check that the executable was just built:

In [10]:
date          # Print current date and time.
ls -ls a.out  # List information about the file a.out, including last modified date.

Thu Sep  4 17:56:15 EDT 2025
72 -rwxr-xr-x  1 rml  staff  33432 Sep  4 17:56 a.out


## Compilation and linkage in multiple steps

We can also stop at the compilation stage by specifying the `-c` option.  This will create an object file named `hello.o`.

In [11]:
gcc -Wall -pedantic -c src/hello.c

In [12]:
date  # The time and date.
ls -ls hello.o  # List information about hello.o.
file hello.o  # What type is the file hello.o.

Thu Sep  4 17:56:20 EDT 2025
8 -rw-r--r--  1 rml  staff  752 Sep  4 17:56 hello.o
hello.o: Mach-O 64-bit object arm64


We can then start with the object file `hello.o` and carry out the linkage step.

In [13]:
gcc -o hello hello.o

In [14]:
date          # Print date and time.
ls -ls hello  # List information about the file hello, including last modified date.
file hello    # Check what type of file hello is.

Thu Sep  4 17:56:25 EDT 2025
72 -rwxr-xr-x  1 rml  staff  33432 Sep  4 17:56 hello
hello: Mach-O 64-bit executable arm64


Later we will look at `make`, a simple &#42;nix tool for automating builds like this.

# Statements and comments

**In C, statements are terminated by a semicolon.**  Because a statement ends at its semicolon rather than at the end of a line (as is typical in Python), it is easy to split a statement over multiple lines.

This is what happens if you omit a semicolon:

In [17]:
cat -n src/no_semicolon.c

     1	// Statements must end with a semicolon.
     2	#include <stdio.h>
     3	int main(void)
     4	{
     5	  printf("Hello, world!\n")
     6	  return 0
     7	}


In [18]:
gcc -Wall -pedantic -c src/no_semicolon.c

src/no_semicolon.c:5:28: error: expected ';' after expression
    5 |   printf("Hello, world!\n")
      |                            ^
      |                            ;
src/no_semicolon.c:6:11: error: expected ';' after return statement
    6 |   return 0
      |           ^
      |           ;
2 errors generated.



There are two ways to indicate comments in C:
* `//` acts like `#` in Python: everything from `//` to the end of a line is a comment;
* block comments (one or more lines) are delimited by `/*  */`.

`// I am a C comment.`

`/* I am also a C comment. */`

<pre>
/*
I
am
a
multiline
C
comment.
*/
</pre>

Block comments also give a quick way to disable a block of code by commenting it out, in the same way you can enclose Python code in `'''  '''` to turn it into a string that is ignored.

A popular style for block (multiline) comments is to begin each line with a star:
<pre>
/*
 * I prefer formatting multiline comments this
 * way so that the body of the comment is clear.
 */
</pre>

You cannot nest block comments:

In [20]:
cat -n src/nested_comments.c

     1	// You cannot nest block comments.
     2	#include <stdio.h>
     3	int main(int argc, char **argv)
     4	{
     5	  /* You cannot /* nest block comments */. */
     6	  return 0;
     7	}


In [21]:
gcc src/nested_comments.c

      comment [-Wcomment]
    5 |   /* You cannot /* nest block comments */. */
      |                 ^
src/nested_comments.c:5:42: error: expected expression
    5 |   /* You cannot /* nest block comments */. */
      |                                          ^



# The C standard

The C standard is **extremely** permissive and an implementation of C can be a Choose Your Own Adventure story.

This stems from the wide variety of architectures in which C has operated over the past 50+ years.

In addition, there are a number of dialects of C (e.g., GNU C) that contain non-standard features.

Here we call attention to portability of C code between different architectures.  The following descriptions are taken from *Effective C*.

## Implementation-defined behavior

<blockquote>
<b>Implementation-defined behavior</b> is program behavior that is not specified by the C standard and that may produce different results between implementations but has consistent, documented behavior within an implementation. An example of implementation-defined behavior is the number of bits in a byte.
</blockquote>

🤯🤯🤯🤯 Yes, the number of bits in a byte may (and does) vary between C implementations. 🤯🤯🤯🤯

## Unspecified behavior 

<blockquote>
<b>Unspecified behavior</b> is program behavior for which the standard provides two or more options but doesn’t mandate which option is chosen in any instance. Each execution of a given expression may yield different results or produce a different value than a previous execution of the same expression. An example of unspecified behavior is function parameter storage layout, which can vary among function invocations within the same program.
</blockquote>

## Undefined behavior

<blockquote>
<b>Undefined behavior</b> is behavior that isn’t defined by the C standard or, less circularly, “behavior, upon use of a nonportable or erroneous program construct or of erroneous data, for which the standard imposes no requirements” (ISO/IEC 9899:2024). Examples of undefined behavior include signed integer overflow and dereferencing an invalid pointer value. Code that has undefined behavior is often incorrect, but not always. </blockquote>

🐞🐞🐞🐞 You definitely want to avoid undefined behavior. 🐞🐞🐞🐞

## Locale-specific behavior and common extensions

<blockquote>
<b>Locale-specific behavior</b> depends on local conventions of nationality, culture, and language that each implementation documents.  <b>Common extensions</b> are widely used in many systems but are not portable to all implementations.
</blockquote>