We already know what a string is. It is an array of non-null characters terminated by a null byte.
- `"xxx"` is called a string literal or a string constant
  - do not confuse it with a character constant, e.g. `'A'`, as it uses single quotes. In contrast to Python, for example, single and double quotes are two different things in C.
- a string constant internally initializes an array of characters, with a null character appended.
- we already know that a string is a contiguous sequence of characters terminated by and including the first null character
  - so, a string literal may include multiple null characters, thus defining multiple strings. In other words, a string literal and a string are two different things.
$ cat main.c
#include <stdio.h>
int
main(void)
{
	printf("%s\n", "hello\0world.");
}
$ cc main.c
$ ./a.out
hello
- note that `'\0'` is just a special case of the octal representation `'\ooo'`, called an octal escape sequence, where `o` is an octal digit (0-7). You can use any ASCII character like that.
  - the full syntax is `'\o'`, `'\oo'`, or `'\ooo'`
$ cat main.c
/*
 * The code assumes that our environment uses ASCII but that is not mandatory.
 * See 5.2.1 Character Sets in the C99 standard for more information.
 */
#include <stdio.h>

int
main(void)
{
	/* Check the ascii manual page: octal 132 is 'Z', and octal 71 is '9'. */
	printf("%c\n", '\132');
	printf("%c\n", '\71');
	/* Will print "a<tab>b" */
	printf("a\11b\n");
}
$ ./a.out
Z
9
a b
- a string constant may be used to initialize a char array, and usually that is how a string is initialized (in contrast to `{ 'a', 'b', ... }`)
#include <stdio.h>

int
main(void)
{
	char s[] = "hello, world.";

	printf("%s\n", s);
}
$ gcc -Wall -Wextra main.c
$ ./a.out
hello, world.
- remember that if `{}` is used for the initialization, you must add the terminating zero yourself, unless you give the array an explicit size larger than the number of initializers (in which case the remaining elements are initialized to zero as usual):
char s[] = { 'h', 'e', 'l', 'l', 'o', '\0' };
- so, the following is still a properly terminated string, but do not use it like that:
char s[10] = { 'h', 'e', 'l', 'l', 'o' };
Anyway, you would probably never do it this way. Just use `= "hello"`.
:eyes: array-fill.c :eyes: array-fill2.c
- a string is printed via `printf()` using the `%s` conversion
printf("%s\n", "hello, world");
- experiment with what `%5s` and `%.5s` do (use any reasonable number)

:eyes: string-format.c
- :wrench: extend the program :eyes: tr-d-chars.c from last time to translate character input, e.g.
tail /etc/passwd | tr 'abcdefgh' '123#'
  - use character arrays defined with string literals to represent the 2 strings
  - see the tr(1) man page on what needs to happen if the 1st string is longer than the 2nd
    - do not store the expanded 2nd string as a literal in your program! (i.e. not `123#####`)

:eyes: tr.c

:wrench: (bonus): refactor the code into a function(s)
- remember that arrays are passed into a function as a pointer (to be explained soon, not needed now) which can be used inside the function with an array subscript.
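The `for` loop is formally defined as:
for (<init>; <predicate>; <update>)
	<body>
It is the same as:
<init>
while (<predicate>) {
	<body>
	<update>
}
with one exception: a `continue` inside the `for` body still executes `<update>` before the next `<predicate>` test. Using the `for` loop is very often easier and more readable.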
Example:
int i;
for (i = 0; i < 10; ++i) {
	printf("%d\n", i);
}
Or in C99:
for (int i = 0; i < 10; ++i) {
	printf("%d\n", i);
}
- the `break` statement terminates execution of the loop (i.e. it jumps right after the enclosing `}`)
- with the `continue` statement, the execution jumps to the end of the loop body (i.e. it jumps to the enclosing `}`). That means the execution continues with the `<predicate>` test (in a `for` loop, `<update>` is executed first).
- both `break` and `continue` statements relate to the smallest enclosing loop they are executed in
:eyes: for.c
:wrench: task: compute the minimum of the averages of the lines in a 2-D integer array (of arbitrary dimensions) that contain only positive values (if there is a negative value in a given line, do not use that line for the computation).

:eyes: 2darray-min-avg.c
"In mathematics, an expression or mathematical expression is a finite combination of symbols that is well-formed according to rules that depend on the context."
In C, expressions are, amongst others, variables, constants, strings, and expressions in parentheses. Also arithmetic expressions, relational expressions, function calls, sizeof, assignments, and ternary expressions.
http://en.cppreference.com/w/c/language/expressions
In C99, an expression can produce results (`2+2` gets 4) or generate side effects (`printf("foo")` sends a string literal to the standard output).
:wrench: task: make the warning an error with your choice of compiler (would be a variant of `-W` in GCC)
Statements are (only from what we already learned) expressions, selections (`if`, `switch` (not introduced yet)), `{}` blocks (known as compounds), iterations, and jumps (`goto` (not introduced yet), `continue`, `break`, `return`).
http://en.cppreference.com/w/c/language/statements
Basically, statements are pieces of code executed in sequence. The function body is a compound statement. The compound statement (a.k.a. block) is a sequence of statements and declarations. Blocks can be nested. Blocks inside a function body are good for variable reuse.
A semicolon is not required after a compound statement but it is allowed (it then forms a separate null statement). The following is valid code then:
int
main(void)
{
	{ }
	{ };
}
A declaration is not a statement (there are subtle consequences, we can show them later).
Some statements must end with `;`. For example, expression statements. The following are all valid expression statements. They do not make much sense though and may generate a warning about an unused result.
/* this one is not a statement */
char c;
/* these are all expression statements */
c;
1 + 1;
1000;
"hello";
:eyes: null-statement.c :eyes: compound-statement.c
- It is not allowed to have a compound statement within an expression.
  - That said, GCC has a language extension (gnu99) that allows this - the reason is to protect against multiple evaluation within macros.
  - The C99 standard does not define it.
- gotcha: the following code has to be compiled with `gcc` with the `-pedantic-errors` option in order to reveal the problem
gcc -std=c99 -Wall -Wextra -pedantic-errors compound-statement-invalid.c
compound-statement-invalid.c: In function ‘main’:
compound-statement-invalid.c:8:6: error: ISO C forbids braced-groups within expressions [-Wpedantic]
8 | if (({ i *= 2; puts("doubled");}), i % 2 == 0) {
| ^
Our recommendation is to always use these options.
:eyes: compound-statement-invalid.c
Motivation:
- memory allocation / shared memory
- protocol buffer parsing
- pointer arithmetics

- the value stored in a pointer variable is the address of the memory that holds an object of the pointed-to type
- declared like this, e.g.
int *p; // pointer to an int
- note on style:
int * p;
int *p; // preferred in this lecture
- a pointer is always associated with a type
- to access the object the pointer points to, use the dereference operator `*`:
printf("%d", *p);
- the dereference is also used to write a value to the variable pointed:
*p = 5;
- in a declaration, you may assign like this:
int i = 5;
int *p = &i;
- good practice is to prefix pointers with the 'p' letter, e.g.:
int val;
int *pval = &val;
- the `&` is an address-of operator and gets the address of the variable (i.e. the memory address of where the value is stored)
- the pointer itself is obviously stored in memory too (it's just another variable). With the declarations above, it looks as follows:
         p
    +---------------+
    |     addr2     |
    +---------------+        i
    ^                        +-------+
    |                        |   5   |
    addr1                    +-------+
                             ^
                             |
                             addr2
- the size of the pointer depends on the architecture and the way the program was compiled (see the `-m32` / `-m64` command line switches of gcc)
- `sizeof (p)` will return the amount of memory needed to store the address of the object the pointer points to
- `sizeof (*p)` will return the amount needed to store the object the pointer points to
:wrench: write a program to print:
- the address of the pointer
- the address where it points to
- the value of the pointed to variable

Use the `%p` formatting for the first two.

:eyes: ptr-basics.c
- the real danger with pointers is that invalid memory access results in a crash (the program is terminated by kernel)
- you can assign a number directly to a pointer (that should trigger a warning, though; we will get to casting and how to fix that later).
int *p = 0x1234;
- a zero pointer value is called a null pointer constant and is defined as the macro `NULL` in the C specification. `NULL` is converted to a null pointer, which is guaranteed in C not to point to any object or function. The C standard leaves dereferencing a null pointer undefined; on common systems it terminates the program.
  - this is because the zero address is left unmapped on purpose, or a page that cannot be accessed is mapped at that address.
  - the C specification says that the macro `NULL` must be defined in `<stddef.h>`
- :wrench: create the null pointer and try to read from it / write to it

:eyes: null-ptr.c
- notice the difference:
int i;
int *p = &i;
vs.
int i;
int *p;
// set value of the pointer (i.e. the address where it points to).
p = &i;
- operator precedence gotcha:
*p // value of the pointed to variable
*p + 1 // the value + 1
*(p + 1) // the value on the address + 1 (see below for what it means)
- note that the `&` address-of operator can be used only on variables (more precisely, on expressions that designate an object in memory)
- thus this is invalid:
p = &(i + 1);
- store value to the address pointed to by the pointer:
*p = 1;
Pointers can be moved forward and backward
p = p + 1;
p++;
p--;
When pointer arithmetic is used, the pointer moves by multiples of the size of the underlying data type.
:wrench: create a pointer to an `int`, print it out, create a new pointer that points to `p + 1`. See what the difference between the 2 pointers is.

:eyes: ptr-diff.c
- `*` has higher precedence than `+`, so:
i = *p + 1;
is equal to
i = (*p) + 1;
- postfix `++` has higher precedence than `*`:
i = *p++;
is evaluated as `*(p++)` but it still does this:
i = *p; p++;
- that is because `++` is used in postfix mode, i.e. the value of the expression `p++` is the value of `p` before the increment.
The standard I/O API (declared in `<stdio.h>`) has been part of the standard since C90.
- `fopen` opens a file and returns an opaque handle (pointer)
  - getting `NULL` means an error
  - the `mode` argument controls the behavior: read, write, append
    - the `+` adds the other mode (write for read and vice versa, read for append)
    - write mode creates the file if it does not exist
    - the `b` binary mode usually does not have any effect (see the standard)
- `fclose` closes the handle
  - important to avoid resource leak (`fopen` can allocate both memory and file descriptor)
- `freopen` can be used to associate the standard streams (`stderr`, `stdin`, or `stdout`) with a file
:wrench: write code that opens the same file in a loop (until `fopen()` fails) without calling `fclose()` on the handle. After how many iterations does it fail on your system?

:eyes: fopen-leak.c
- `fprintf` - `printf` to a stream
- `fscanf` - basically parses text input from a stream according to a format string
  - apart from the stream and the format string, all the arguments must be pointers
- `fputs` / `fgets` - send/read a string to/from a stream
- `fputc` / `fgetc` - send/read a `char` to/from a stream
- `fwrite` / `fread` - for writing/reading binary data (such as structures or raw numeric types)
`fread()` reads a selected number of items of a given size to memory. We can use either an array or we can directly read to a variable through the address-of operator. In our case, we will be reading a file byte by byte, so we can give `fread()` just the address of a character variable.
char c;
FILE *fp;

/* Choose any other file you have on your system. */
if ((fp = fopen("/etc/passwd", "r")) == NULL)
	err(1, "fopen");
/*
 * fread() returns a number of *items* read. In our case, it's the same
 * as number of bytes as we read it one byte at a time.
 */
while (fread(&c, sizeof (c), 1, fp) == 1) {
	putchar(c);
}
fclose(fp);
:eyes: read-file.c

:wrench: Note that you could read more characters at a time. However, keep in mind that the 2nd argument is the size of one element, and the 3rd argument is how many elements we read in one call. For example:
char a[16];
size_t n;
...
while ((n = fread(a, 1, sizeof (a), fp)) > 0) {
	/* process the n bytes here */
	/* if we read less than requested, we hit end of file (or an error) */
	if (n < sizeof (a))
		break;
}
Check the solution here: :key: read-file2.c
Also check the manual page for `fread()` and ignore for now that the 1st argument is of type `void *`, we will get there later. As mentioned above, you can safely put there an array or an address of a variable.
:wrench: Check the man page for `fwrite()` and modify the code so that what is read from the file is written to some other file. Do not forget to open the output file for writing. All the details are in the man page.
When reading/writing to a file using the above functions, the current position changes accordingly. However, the position can be manipulated without performing any I/O.
- `fseek` - moves the position
  - the `whence` parameter has 3 possible values and makes the `offset` parameter relative to:
    - `SEEK_SET` - the beginning of the file
    - `SEEK_END` - the end of the file
    - `SEEK_CUR` - the current location of the cursor in the file
- `ftell` - get the current position in the file
- the `err()` family of functions is not defined by any standard, however it is handy
- present in BSD, glibc (= Linux distros), Solaris, macOS, etc.
- use the `<err.h>` include, see the err(3) man page
- this is especially handy when working with I/O
- instead of writing:
if (error) {
	fprintf(stderr, "error occurred: %s\n", strerror(errno));
	exit(1);
}
- write the following (`err()` appends a colon, a space, and the `strerror(errno)` text by itself):
if (error)
	err(1, "error occurred");
- notice that a newline is inserted automatically at the end of the error message.
- or for functions that do not modify the `errno` value, use the x variant:
if (some_error)
	errx(1, "ERROR: %d", some_error);
- there's also `warn()` / `warnx()` that do not exit the program
Note that home assignments are entirely voluntary but writing code is the only way to learn a programming language.
:wrench: get a file size using the standard IO API (that is, `lseek(2)` is prohibited even if you know it).

:wrench: create an array of `int` values (of arbitrary positive length with values ranging from `INT_MAX` to `0`), write the array to a file, read the values into another array and print them to the standard error.

Between the writing and reading the file handle has to remain open. Use the same file handle for reading and writing.

Use `od(1)` to verify the content of the file (thus it is handy to start with `INT_MAX` and e.g. divide by 2 for each successive value).

:eyes: fopen-binary.c
:wrench: use the file created by the previous program. Read the values from the end of the file to the beginning of the file one by one without knowing the file size and print the numbers to the standard error output.
:wrench: create a text file where each line begins with an integer followed by a space and a string, e.g.:
42 towel
13 dwarfs and Snow White
Read the file using `fscanf()` and print the values (i.e. the integer and the string) from each line to standard output.