minishell

The aim of this project is to create a simple shell, to learn about processes, file descriptors and pipes. Inspired by the "42 Coding School" exercise "minishell" (January 2022).

Introduction

Allowed functions

readline, rl_clear_history, rl_on_new_line, rl_replace_line, rl_redisplay, add_history, printf, malloc, free, write, access, open, read, close, fork, wait, waitpid, wait3, wait4, signal, sigaction, kill, exit, getcwd, chdir, stat, lstat, fstat, unlink, execve, dup, dup2, pipe, opendir, readdir, closedir, strerror, perror, isatty, ttyname, ttyslot, ioctl, getenv, tcsetattr, tcgetattr, tgetent, tgetflag, tgetnum, tgetstr, tgoto, tputs

Description

The aim of the exercise is to write a shell. Bash can be taken as a reference. About the shell:

It should not interpret unclosed quotes or unspecified special characters like '\' or ';'.
It should not use more than one global variable.
It should show a prompt when waiting for a new command.
It should have a working History.
It should search for and launch the right executable (given with a relative path, an absolute path or without a path).
Environment variables should expand to their values.
"$?" should expand to the exit status of the most recently executed foreground pipeline.
Single quotes inhibit all interpretation of a sequence of characters.
Double quotes inhibit all interpretation of a sequence of characters except for environment variables.
The key combinations ctrl+c, ctrl+d and ctrl+\ should behave like in bash.
'|' pipes the output of a command to the input of the next command.
'>' should redirect output.
'<' should redirect input.
">>" should redirect output with append mode.
"<<" should redirect input like a basic "heredoc". It doesn’t need to update history or handle expansion.

The following builtins have to be implemented:

echo with option -n
cd
pwd with no options
export with no options
unset with no options
env with no options or arguments
exit with no options

Approach

My approach was to first understand how the original bash works so that I could reimplement it. The Bash Reference Manual was the most important resource whenever I had to look up how something was supposed to work. Reading just its table of contents already gives an idea of the sequence in which things happen and a nice overview of the individual elements.

The Definitions were super useful. Here you learn what metacharacters, words, operators and tokens are.

Some of the most important chapters of the manual for this exercise are: Quoting, Pipes, Redirections, Executing Commands and Builtins.

In a nutshell, when my minishell (which will not be as complex as bash) reads and executes input from the command line, it has to do something like this:

Split the input up into tokens (words and operators), via the metacharacters, considering the quoting rules.
Perform expansions.
Parse the expanded tokens into commands.
Set up redirections, if necessary.
Execute the command(s).

From what I had learned so far, I decided to divide the shell into three main parts: the lexer, the parser and the executor.

Lexer

The lexer (also called lexical analyzer or tokenizer) splits the input into a list of tokens using the metacharacters, and performs expansions and quote removal.

For example, the command line:

will give us the following token list:

Note how all the unquoted whitespace has disappeared.
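As a rough illustration of the splitting step, here is a minimal tokenizer sketch in C. It is not the code of this repository: the names `tokenize`, `free_tokens` and `MAX_TOKENS` are made up for the example, expansion is left out, and standard library functions such as `strlen` are used for brevity even though they are not on the list of allowed functions.

```c
#include <stdlib.h>
#include <string.h>

#define MAX_TOKENS 64 /* arbitrary limit, chosen for this sketch */

static int is_blank(char c)
{
    return (c == ' ' || c == '\t');
}

static int is_operator_char(char c)
{
    return (c == '|' || c == '<' || c == '>');
}

/* Frees the first `count` tokens produced by tokenize(). */
void free_tokens(char *tokens[], int count)
{
    while (count > 0)
        free(tokens[--count]);
}

/* Copies one operator ("|", "<", ">", "<<" or ">>") into tok. */
static size_t read_operator(const char *s, char *tok, size_t *len)
{
    size_t i = 0;

    tok[(*len)++] = s[i++];
    if (s[0] != '|' && s[i] == s[0])
        tok[(*len)++] = s[i++];
    return (i);
}

/* Copies one word into tok, dropping quotes. Returns 0 on an unclosed quote. */
static size_t read_word(const char *s, char *tok, size_t *len)
{
    char   quote = '\0';
    size_t i = 0;

    while (s[i] && (quote || (!is_blank(s[i]) && !is_operator_char(s[i]))))
    {
        if (quote == '\0' && (s[i] == '\'' || s[i] == '"'))
            quote = s[i];       /* opening quote: not copied */
        else if (s[i] == quote)
            quote = '\0';       /* closing quote: not copied */
        else
            tok[(*len)++] = s[i];
        i++;
    }
    return (quote == '\0' ? i : 0);
}

/* Fills tokens[] and returns the token count, or -1 on error. */
int tokenize(const char *line, char *tokens[MAX_TOKENS])
{
    int    count = 0;
    size_t i = 0;
    size_t used;
    size_t len;
    char   *tok;

    while (line[i] != '\0')
    {
        while (is_blank(line[i]))
            i++;
        if (line[i] == '\0')
            break;
        /* a token is never longer than the rest of the line */
        tok = NULL;
        if (count < MAX_TOKENS)
            tok = malloc(strlen(line + i) + 1);
        len = 0;
        used = 0;
        if (tok != NULL && is_operator_char(line[i]))
            used = read_operator(line + i, tok, &len);
        else if (tok != NULL)
            used = read_word(line + i, tok, &len);
        if (used == 0)
        {
            free(tok);
            free_tokens(tokens, count);
            return (-1);
        }
        tok[len] = '\0';
        tokens[count++] = tok;
        i += used;
    }
    return (count);
}
```

With this sketch, the input `echo "don't | split" | wc -l` yields five tokens: `echo`, `don't | split`, `|`, `wc` and `-l`. One limitation to keep in mind: after quote removal, a quoted `"|"` can no longer be told apart from a pipe operator, so a real lexer would probably also store a type (word or operator) with each token.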

Parser

The parser processes the tokens and creates a command table (a data structure that stores the commands that will be executed), following the shell's grammar. The grammar is written in a format called Backus-Naur Form (BNF). A grammar determines the structure of a language and is necessary to make sense of a sequence of words that form a phrase (or, in our case, the command line input). For example, using English grammar you can put the words "the", "blue", "is" and "table" in order so that they form a sentence: "The table is blue."

The Backus-Naur Form consists of a set of rules. Each rule has two parts: a name (you can look at it as a "building block") on the left and the expansion of the name (you can look at it as a "blueprint" for the building block) on the right.

This is pretty confusing. For a better understanding, I tried to write down my own "grammar recipes". I used my own words (so they might not match the original terms). I use square brackets [ ] to represent a "container", / (= "or") to separate options, and * to show that the container it is attached to is optional but can also occur more than once. Below each rule I wrote down how I would read it out in English:

[simple command] = [executable [argument]*]

A simple command is an executable followed by 0 or more arguments.

[input redirection] = [<filename]

An input redirection is an input operator followed by a filename.

[output redirection] = [>filename]

An output redirection is an output operator followed by a filename.

[append redirection] = [>>filename]

An append redirection is an append operator followed by a filename.

[heredoc] = [<<delimiter]

A heredoc is a heredoc operator followed by a delimiter.

[redirection] = [[input redirection] / [output redirection] / [append redirection] / [heredoc]]

A redirection is an input redirection or an output redirection or an append redirection or a heredoc.

[command] = [[simple command] [redirection]*]

A command can contain a simple command and 0 or more redirections. The order of the elements does not matter.

[pipeline] = [[command] [ | [command]]*]

A pipeline consists of a command followed by 0 or more groups of a pipe operator and another command.

So the minishell input will have the following format:

[executable [argument]*] [< filename]* [<< delimiter]* [> filename]* [>> filename]* [| [executable [argument]*] [< filename]* [<< delimiter]* [> filename]* [>> filename]*]*

Command table

Knowing the grammar, creating the command table from the tokens is easy. Let's try it with our example token list from above:

First we search for the pipe operators, as they separate the commands.

This will give us the commands:

Command1: << END < /home/infile grep -v 42

Command2: >> outfile1 wc -l > outfile2

Command3: ls

Command4: > outfile3

Command5: echo don't | split

Within each command we then search for the redirection operators, because we know that the token following a redirection operator is the associated filename/delimiter.

Command1: ~~<<~~ ~~END~~ ~~<~~ ~~/home/infile~~ grep -v 42

Command2: ~~>>~~ ~~outfile1~~ wc -l ~~>~~ ~~outfile2~~

Command3: ls

Command4: ~~>~~ ~~outfile3~~

Command5: echo don't | split

So what's left is the executable and the arguments. As the arguments follow the executable we now know that the first token that is left is the executable and the rest are the arguments.

Command1: grep -v 42

Command2: wc -l

Command3: ls

Command4: -

Command5: echo don't | split

| # | executable | list of arguments | list of stdin redirections | list of stdout redirections |
| --- | --- | --- | --- | --- |
| Command1 | `grep` | `-v` `42` | `<<END` `</home/infile` | - |
| Command2 | `wc` | `-l` | - | `>>outfile1` `>outfile2` |
| Command3 | `ls` | - | - | - |
| Command4 | - | - | - | `>outfile3` |
| Command5 | `echo` | `don't \| split` | - | - |
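The three steps above (split at the pipes, pick out the redirections, treat the rest as executable and arguments) could be sketched in C roughly like this. It is only an illustration and not the parser used in this repository: the types `t_redir` and `t_command`, the function `build_table` and the limit `MAX_ITEMS` are hypothetical, and operators are recognized by plain string comparison, which assumes that no quoted word is equal to an operator.

```c
#include <string.h>

#define MAX_ITEMS 16 /* arbitrary limit, chosen for this sketch */

typedef struct s_redir
{
    const char *op;     /* "<", "<<", ">" or ">>" */
    const char *target; /* filename or heredoc delimiter */
}   t_redir;

typedef struct s_command
{
    const char *argv[MAX_ITEMS + 1]; /* argv[0] = executable, NULL-terminated */
    int        argc;
    t_redir    in[MAX_ITEMS];        /* "<" and "<<" in input order */
    int        in_count;
    t_redir    out[MAX_ITEMS];       /* ">" and ">>" in input order */
    int        out_count;
}   t_command;

static int is_redir(const char *tok)
{
    return (!strcmp(tok, "<") || !strcmp(tok, "<<")
        || !strcmp(tok, ">") || !strcmp(tok, ">>"));
}

static int is_pipe(const char *tok)
{
    return (strcmp(tok, "|") == 0);
}

/* Stores one redirection; returns the number of tokens consumed, 0 on error. */
static int add_redir(t_command *cmd, const char *tokens[], int i, int n)
{
    t_redir *list = (tokens[i][0] == '<') ? cmd->in : cmd->out;
    int     *count = (tokens[i][0] == '<') ? &cmd->in_count : &cmd->out_count;

    /* the token after the operator must be a filename or delimiter */
    if (i + 1 >= n || is_redir(tokens[i + 1]) || is_pipe(tokens[i + 1]))
        return (0);
    if (*count == MAX_ITEMS)
        return (0);
    list[*count].op = tokens[i];
    list[*count].target = tokens[i + 1];
    (*count)++;
    return (2);
}

/* Fills table[] and returns the number of commands, or -1 on error. */
int build_table(const char *tokens[], int n, t_command table[], int max_cmds)
{
    int       cmds = 0;
    int       i = 0;
    int       used;
    t_command *cmd;

    while (i < n)
    {
        if (cmds == max_cmds)
            return (-1);
        cmd = &table[cmds++];
        memset(cmd, 0, sizeof(*cmd)); /* counts 0, pointers NULL on usual platforms */
        while (i < n && !is_pipe(tokens[i]))
        {
            used = 1;
            if (is_redir(tokens[i]))
                used = add_redir(cmd, tokens, i, n);
            else if (cmd->argc < MAX_ITEMS)
                cmd->argv[cmd->argc++] = tokens[i];
            else
                used = 0;
            if (used == 0)
                return (-1);
            i += used;
        }
        if (cmd->argc + cmd->in_count + cmd->out_count == 0)
            return (-1); /* empty command, e.g. "| ls" */
        if (i < n && ++i == n)
            return (-1); /* trailing pipe */
    }
    return (cmds);
}
```

Fed with the example token list, this sketch produces five entries that match the table above: the first has `grep` as `argv[0]` and two input redirections, the fourth has no executable at all and only one output redirection. The table entries point into the token array, so the tokens have to stay alive as long as the table is used.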

Executor

The executor takes the command table generated by the parser and creates a new process for each command that is not a builtin. If necessary it will create pipes to forward the output of one process to the input of the next one and redirect the standard in- and output.
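To make the pipe and redirection mechanics a bit more concrete, here is a small sketch in C. It assumes a hypothetical helper `read_child_output` that is not part of this repository: the child redirects its standard output into a pipe with `dup2` and writes a message, and the parent reads that message from the other end. In a real executor the child would call `execve` instead of `write`.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Forks a child whose standard output is redirected into a pipe and reads
 * what the child wrote. Returns the number of bytes stored in buf (which is
 * always NUL-terminated), or -1 on error.
 */
ssize_t read_child_output(const char *msg, char *buf, size_t size)
{
    int     fds[2]; /* fds[0]: read end, fds[1]: write end */
    pid_t   pid;
    ssize_t n;
    size_t  total = 0;

    if (size == 0 || pipe(fds) == -1)
        return (-1);
    pid = fork();
    if (pid == -1)
    {
        close(fds[0]);
        close(fds[1]);
        return (-1);
    }
    if (pid == 0)
    {
        /* child: let fd 1 refer to the write end, then drop the originals */
        close(fds[0]);
        if (dup2(fds[1], STDOUT_FILENO) == -1)
            _exit(1);
        close(fds[1]);
        if (write(STDOUT_FILENO, msg, strlen(msg)) < 0)
            _exit(1);
        /* _exit() skips stdio cleanup; the exercise itself only allows exit() */
        _exit(0);
    }
    /* parent: close the unused write end, otherwise read() never sees EOF */
    close(fds[1]);
    while (total < size - 1
        && (n = read(fds[0], buf + total, size - 1 - total)) > 0)
        total += (size_t)n;
    buf[total] = '\0';
    close(fds[0]);
    waitpid(pid, NULL, 0);
    return ((ssize_t)total);
}
```

The important detail is closing every pipe end that a process does not use. As long as any copy of the write end stays open, the reader never gets an end-of-file and the pipeline hangs.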

Creating the processes is necessary because we execute the non-builtin executables with the execve function, which replaces the code of the calling process with the code of the executable. Calling it directly in the shell would therefore end the shell after the first successful execve call. Calling it in a child process only replaces the child, while the parent process (the shell) goes on.
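A minimal sketch of this fork-then-execve pattern in C could look as follows. The function name `run_command` is hypothetical, and the returned values follow the conventions bash uses for "$?" (127 when a command cannot be found, 128 plus the signal number when a command is killed by a signal). Mapping every execve failure to 127 is a simplification made for this example.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Runs the program at `path` in a child process and returns its exit
 * status the way a shell would report it in "$?", or -1 if fork() or
 * waitpid() fails.
 */
int run_command(const char *path, char *const argv[], char *const envp[])
{
    pid_t pid;
    int   status;

    pid = fork();
    if (pid == -1)
        return (-1);
    if (pid == 0)
    {
        /* child: on success execve() never returns */
        execve(path, argv, envp);
        /* only reached if execve() failed */
        _exit(127);
    }
    /* parent: the shell keeps running and waits for the child */
    if (waitpid(pid, &status, 0) == -1)
        return (-1);
    if (WIFEXITED(status))
        return (WEXITSTATUS(status));
    if (WIFSIGNALED(status))
        return (128 + WTERMSIG(status));
    return (-1);
}
```

For example, calling it with the path `/bin/sh` and the arguments `sh`, `-c`, `exit 3` returns 3, and the calling process is still alive afterwards to print the next prompt.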

How to fork a process

How to redirect

How to setup pipes

How to use the exec family functions

How to handle signals

Prerequisites

Tested on Ubuntu 20.04.3 LTS

gcc (sudo apt-get install gcc)
make (sudo apt-get install make)
readline (sudo apt-get install lib32readline8 lib32readline-dev)

How to launch

Compile the program by running make in the root directory of the repository.

Run it like this:

./minishell

Example

Resources

Bash Reference Manual

Code Vault Playlist - Unix Processes in C

Notes

The parser is reused from a former team project and was written by jzhou.

Please note that the external "readline" function can produce some memory leaks. For this exercise I did not care about them.

aenglert42/minishell
