# 1 Introduction and flat files

In this chapter, you'll learn how to import data into Python from all types of flat files, which are a simple and prevalent form of data storage. You've previously learned how to use NumPy and pandas; here you'll learn how to use these packages to import flat files and customize your imports.

## 1.1 Welcome to the course!

## 1.2 Exploring your working directory

In order to import data into Python, you should first have an idea of what files are in your working directory.

IPython, which is running on DataCamp's servers, has a bunch of cool commands, including its [magic commands](https://ipython.readthedocs.io/en/stable/overview.html). In addition, starting a line with `!` gives you system shell access: the rest of the line is passed to the system shell rather than to Python. This means that the IPython command `! ls` will display the contents of your current directory.

Your task is to use `! ls` to check out the contents of your current directory and answer the following question: which of the following files is in your working directory?
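Outside IPython, `! ls` is not available. As a minimal sketch, assuming only the Python standard library, the same check can be made with `os.listdir()`; the variable names `cwd` and `entries` are illustrative.

```python
import os

# Path of the directory Python is currently running in
cwd = os.getcwd()

# Names of the entries in that directory, sorted for stable output
entries = sorted(os.listdir(cwd))

for name in entries:
    print(name)
```

This lists both files and subdirectories, much like a plain `ls` does.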

### 1.2.1 Instructions

### Possible Answers

- [ ] `huck_finn.txt`

- [ ] `titanic.csv`

- [x] `moby_dick.txt`

## 1.3 Importing entire text files

In this exercise, you'll be working with the file `moby_dick.txt`. It is a text file that contains the opening sentences of Moby Dick, one of the great American novels! Here you'll get experience opening a text file, printing its contents to the shell and, finally, closing it.

### 1.3.1 Instructions

- Open the file `moby_dick.txt` as *read-only* and store it in the variable `file`. Make sure to pass the filename enclosed in quotation marks `''`.
- Print the contents of the file to the shell using the `print()` function. As Hugo showed in the video, you'll need to apply the method `read()` to the object `file`.
- Check whether the file is closed by executing `print(file.closed)`.
- Close the file using the `close()` method.
- Check again that the file is closed as you did above.
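The steps above can be sketched as follows. So that the example runs anywhere, it first writes a small stand-in file; the name `sample.txt` and its two lines of placeholder text are assumptions, not the contents of `moby_dick.txt`. In the exercise itself you would pass `'moby_dick.txt'` to `open()` instead.

```python
import os
import tempfile

# Stand-in for moby_dick.txt (hypothetical name, placeholder text)
path = os.path.join(tempfile.mkdtemp(), 'sample.txt')
setup = open(path, mode='w')
setup.write('first line\nsecond line\n')
setup.close()

# Open the file as read-only
file = open(path, mode='r')

# Read the whole file into one string and print it
text = file.read()
print(text)

# The file is still open at this point
closed_before = file.closed
print(closed_before)

# Close the file, then check again
file.close()
closed_after = file.closed
print(closed_after)
```

The `closed` attribute is `False` while the file is open and `True` once `close()` has been called.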

## 1.4 Importing text files line by line

For large files, you may not want to print all of their content to the shell; you may wish to print only the first few lines. The `readline()` method allows you to do this. When a file is open and bound to the variable `file`, executing `file.readline()` returns the first line. If you execute the same command again, you get the second line, and so on.

In the introductory video, Hugo also introduced the concept of a **context manager**. He showed that you can bind a variable `file` by using a context manager construct:

```python
with open('huck_finn.txt') as file:
    pass  # code indented here can use the open file
```

While still within this construct, the variable `file` will be bound to the file object returned by `open('huck_finn.txt')`; thus, to print the first line of the file to the shell, all the code you need to execute is:

```python
with open('huck_finn.txt') as file:
    print(file.readline())
```
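One benefit of the context manager is that the file is closed automatically when the indented block ends, so there is no need to call `close()`. The sketch below illustrates this with a stand-in file; the name `sample.txt` and its placeholder contents are assumptions made so the example is self-contained.

```python
import os
import tempfile

# Stand-in file (hypothetical name, placeholder text)
path = os.path.join(tempfile.mkdtemp(), 'sample.txt')
with open(path, mode='w') as out:
    out.write('first line\nsecond line\n')

with open(path) as file:
    first = file.readline()   # includes the trailing newline character
    inside = file.closed      # the file is still open inside the block

after = file.closed           # the file was closed on leaving the block

print(repr(first), inside, after)
```

Because `readline()` keeps the line's trailing newline, `print(file.readline())` shows an extra blank line after each line of text.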

You'll now use these tools to print the first few lines of `moby_dick.txt`!

### 1.4.1 Instructions

- Open `moby_dick.txt` using the `with` context manager and the variable `file`.
- Print the first three lines of the file to the shell by using `readline()` three times within the context manager.
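A minimal sketch of these two steps, again using a stand-in file so the code runs on its own. The name `sample.txt` and the four placeholder lines are assumptions; in the exercise you would open `'moby_dick.txt'`.

```python
import os
import tempfile

# Stand-in for moby_dick.txt (hypothetical name, placeholder text)
path = os.path.join(tempfile.mkdtemp(), 'sample.txt')
with open(path, mode='w') as out:
    out.write('line one\nline two\nline three\nline four\n')

# Each call to readline() returns the next unread line
with open(path) as file:
    line1 = file.readline()
    line2 = file.readline()
    line3 = file.readline()

print(line1)
print(line2)
print(line3)
```

The fourth line is never read, which is the point of `readline()` for large files: only the lines you ask for are returned.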

## 1.5 The importance of flat files in data science

## 1.6 Pop quiz: examples of flat files

You're now well-versed in importing text files and you're about to become a wiz at importing flat files. But can you remember exactly what a flat file is? Test your knowledge by answering the following question: which of the file types below is NOT an example of a flat file?

### 1.6.1 Answer the question

### Possible Answers

- [ ] A .csv file.

- [ ] A tab-delimited .txt.

- [x] A relational database (e.g. PostgreSQL).

## 1.7 Pop quiz: what exactly are flat files?

Which of the following statements about flat files is incorrect?

### 1.7.1 Answer the question

### Possible Answers

- [ ] Flat files consist of rows and each row is called a record.

- [x] Flat files consist of multiple tables with structured relationships between the tables.

- [ ] A record in a flat file is composed of *fields* or *attributes*, each of which contains at most one item of information.

- [ ] Flat files are pervasive in data science.
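The vocabulary of rows, records, and fields can be made concrete with a small sketch. It parses a comma-separated string with the standard library's `csv` module; the column names and values are made-up illustration data, not taken from any course file.

```python
import csv
import io

# Hypothetical flat-file contents: a header row, then one record per row
raw = "name,age\nAda,36\nGrace,45\n"

reader = csv.reader(io.StringIO(raw))

# The first row holds the field names
header = next(reader)

# Every remaining row is one record, split into its fields
records = list(reader)

for record in records:
    print(dict(zip(header, record)))
```

Each record is a single row of one table, and each field holds one item of information, here read in as a string.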