# Introduction to Databricks File System Utilities

In this demo, we’ll explore the Databricks file system utilities. Databricks includes a built-in utility called **dbutils** that simplifies common tasks within notebooks.

For now, we’re focusing solely on the file system operations. So, let’s click on **File System Utility** and see it in action!

Databricks’ file system utilities provide a set of commands to manage files and directories in the Databricks File System (DBFS) and other supported storage systems. For example, you can:

- Copy files from one directory to another
- Return the beginning of a file (up to a maximum number of bytes) as a UTF-8 encoded string
- List the contents of a directory
- Remove a file or a directory

Let’s head over to Databricks and see why this is useful.

## Databricks File System Overview

Here is our Databricks File System under Catalog. We have the **FileStore** directory and a **tables** subdirectory. Although you can view the contents via the UI, you can also list them using the Databricks file system utility.

### Listing the Root Path

All you need to do is provide a path; here I'll pass the root path:

In [None]:
# List the contents of the root directory in DBFS
dbutils.fs.ls("/")

The output might not be super clear, but it appears that we have a **FileStore** directory as well as **databricks-datasets** and **databricks-results**.

The last two directories are not visible in the DBFS browser. These are special system directories provided by Databricks:

- **dbfs:/databricks-datasets/** is a read-only mount that offers a collection of sample datasets for learning and testing.
- **dbfs:/databricks-results/** is used by Databricks to store outputs, logs, or results from jobs and interactive queries.

Let's view the contents of the **databricks-datasets** directory:

In [None]:
# List the contents of the databricks-datasets directory
dbutils.fs.ls("dbfs:/databricks-datasets/")

For a nicer output, we can embed this in the `display` function:

In [None]:
# Display the contents of databricks-datasets with better formatting
display(dbutils.fs.ls("dbfs:/databricks-datasets/"))
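
Since `dbutils.fs.ls` returns a regular Python list of `FileInfo` entries, we can also process the result in plain Python. Here's a minimal sketch that splits a listing into directories and files. The helper name `split_listing` is made up for illustration, and it assumes that directory names in the listing end with a trailing slash, which is how DBFS listings usually report them.

```python
def split_listing(entries):
    """Split a dbutils.fs.ls() result into directory names and file names.

    `entries` is any iterable of objects with a `name` attribute,
    such as the FileInfo entries returned by dbutils.fs.ls().
    """
    dirs, files = [], []
    for entry in entries:
        # Assumption: directory names end with "/" in the listing.
        if entry.name.endswith("/"):
            dirs.append(entry.name)
        else:
            files.append(entry.name)
    return sorted(dirs), sorted(files)
```

In a notebook, we could call it as `dirs, files = split_listing(dbutils.fs.ls("dbfs:/databricks-datasets/"))`.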

### Displaying File Contents with `head`

Let's now use the `head` method to display one of the files. We'll display the README markdown file from the datasets:

In [None]:
# Return the beginning of the README.md file (up to 64 KB by default) as a string
dbutils.fs.head("dbfs:/databricks-datasets/README.md")

The raw contents of the file are now displayed in the cell output.
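
The `head` method also accepts an optional second argument that caps the number of bytes returned, which is handy when we only want a quick peek. Here's a small sketch built on that. The helper name `preview_lines` and its default values are assumptions for illustration, and `fs` is expected to be `dbutils.fs`, passed in as a parameter so the helper can be tried outside a notebook.

```python
def preview_lines(fs, path, max_lines=5, max_bytes=1024):
    """Return up to `max_lines` lines from the start of the file at `path`.

    `fs` is expected to behave like dbutils.fs. Only the first `max_bytes`
    bytes are read, so the last line returned may be cut off.
    """
    text = fs.head(path, max_bytes)
    return text.splitlines()[:max_lines]
```

In a notebook, this could be called as `preview_lines(dbutils.fs, "dbfs:/databricks-datasets/README.md")`.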

### Copying a File

Now, let me show you how to copy a file from one location to another. We'll copy the README markdown file and store it in our **FileStore** directory.

The `cp` method takes two arguments: the source path and the target path.

In [None]:
# Copy the README.md file from databricks-datasets to FileStore
dbutils.fs.cp("dbfs:/databricks-datasets/README.md", "dbfs:/FileStore/README.md")

Let's check the contents of the **FileStore** to confirm the copy:

In [None]:
# Display the contents of FileStore
display(dbutils.fs.ls("dbfs:/FileStore/"))

We have indeed copied the file from **databricks-datasets** to **FileStore**.
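
Instead of checking the listing by eye, we can also test for the copied file in code. `dbutils.fs` has no dedicated "exists" command that I'm relying on here, so this sketch uses a common workaround: try to list the path and treat a failure as "not there". The helper name `path_exists` is hypothetical, and `fs` stands in for `dbutils.fs`.

```python
def path_exists(fs, path):
    """Return True if `path` can be listed, False otherwise.

    `fs` is expected to behave like dbutils.fs.
    """
    try:
        fs.ls(path)
        return True
    except Exception:
        # Assumption: any failure is treated as "path not found".
        # This also hides other problems, such as permission errors.
        return False
```

In a notebook, `path_exists(dbutils.fs, "dbfs:/FileStore/README.md")` should return `True` once the copy has succeeded.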

### Deleting a File

To delete the file, we can use the `rm` method:

In [None]:
# Remove the copied README.md file from FileStore (adjust the path if needed)
dbutils.fs.rm("dbfs:/FileStore/README.md")

### Copying an Entire Directory

First, let's view the contents of the **databricks-datasets** directory again:

In [None]:
display(dbutils.fs.ls("dbfs:/databricks-datasets/"))

Now, I'll copy the **weather** directory from **databricks-datasets** to a folder in **FileStore**. Because the source is a directory, we set the third argument of `cp` (the recurse flag) to `True`.

In [None]:
# Copy the weather directory recursively
dbutils.fs.cp("dbfs:/databricks-datasets/weather/", "dbfs:/FileStore/weather", True)

Let's verify the copied directory in **FileStore**:

In [None]:
display(dbutils.fs.ls("dbfs:/FileStore/"))
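
Listing **FileStore** only shows the top level, so it doesn't tell us what ended up inside the copied folder. Here's a rough sketch of a recursive listing built on `ls`. The helper name `list_files_recursively` is made up for illustration, `fs` stands in for `dbutils.fs`, and the code again assumes that directory names end with a trailing slash.

```python
def list_files_recursively(fs, path):
    """Return the paths of all files below `path`, descending into subdirectories.

    `fs` is expected to behave like dbutils.fs.
    """
    files = []
    for entry in fs.ls(path):
        # Assumption: directory names end with "/" in the listing.
        if entry.name.endswith("/"):
            files.extend(list_files_recursively(fs, entry.path))
        else:
            files.append(entry.path)
    return files
```

In a notebook, we could run `list_files_recursively(dbutils.fs, "dbfs:/FileStore/weather/")` and compare the result with the same call on the source directory.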

### Deleting a Directory

To delete a folder that contains files, you need to set the recursive flag. Calling `dbutils.fs.rm("dbfs:/FileStore/weather")` on its own won't remove a non-empty directory.

Let's try deleting without recursion first:

In [None]:
# This may not delete the folder if it contains files
dbutils.fs.rm("dbfs:/FileStore/weather")

Now, let's delete the directory recursively by setting the second argument to `True`:

In [None]:
# Delete the weather directory and its contents recursively
dbutils.fs.rm("dbfs:/FileStore/weather", True)
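
A recursive delete removes everything below the path, so it's worth being careful about what we pass in. Here's one possible guard, sketched under a few assumptions: the helper name `remove_tree` and the `allowed_prefix` default are made up for this demo, `fs` stands in for `dbutils.fs`, and the check is a plain string comparison that does not resolve things like `..` in a path.

```python
def remove_tree(fs, path, allowed_prefix="dbfs:/FileStore/"):
    """Recursively delete `path`, but only if it sits below `allowed_prefix`.

    `fs` is expected to behave like dbutils.fs.
    """
    inside = path.startswith(allowed_prefix)
    is_prefix_itself = path.rstrip("/") == allowed_prefix.rstrip("/")
    if not inside or is_prefix_itself:
        raise ValueError(f"Refusing to delete {path!r}: not below {allowed_prefix!r}")
    # The second argument is the recurse flag, as in the cell above.
    return fs.rm(path, True)
```

In a notebook, `remove_tree(dbutils.fs, "dbfs:/FileStore/weather")` would delete the copied folder, while passing `"dbfs:/FileStore/"` or a path under **databricks-datasets** would raise a `ValueError` before anything is removed.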

## Conclusion

That concludes our demo of the Databricks File System Utilities. These utilities allow you to manage and interact with files and directories in DBFS and mounted storage directly within your notebooks, so you can list, read, write, copy, move, and delete files without leaving the interactive environment.