# 1. Introduction to Databricks Notebooks

### What is a Databricks Notebook?

A **Databricks Notebook** is an interactive workspace where you can write, run, and visualize code for data engineering, analytics, and machine learning in one place.

Databricks notebooks support multiple languages and are deeply integrated with **Apache Spark**, **Delta Lake**, and **Unity Catalog**.

They are commonly used for:
- Data exploration and transformation  
- Building ETL pipelines  
- Running analytics and SQL queries  
- Developing machine learning workflows  
- Collaboration and documentation  

### Supported Languages

Databricks notebooks can be written in four languages:

- **Python (PySpark)**
- **SQL**
- **Scala**
- **R**

Each notebook has a **default language**, but you can mix languages using magic commands.

# 2. Attach a Notebook to a Cluster

Before running any code, a notebook must be attached to a **cluster**, which provides the compute resources.

**Steps:**

1. At the top of the notebook, click the cluster dropdown.
2. Select an existing cluster.
3. If the cluster is not already running, click **Start** and wait until the cluster status shows a green indicator.

> **Note:** Cluster startup time depends on cluster size and configuration.
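
Once the notebook is attached, a quick sanity check is to ask the Spark session for its version. The sketch below assumes the `spark` variable that Databricks predefines in attached notebooks; the helper name `cluster_summary` is hypothetical.

```python
def cluster_summary(spark_session):
    """Return a short description of the Spark session the notebook is attached to.

    `spark_session` is assumed to be the predefined `spark` object
    available in an attached Databricks notebook.
    """
    return f"Attached to Spark {spark_session.version}"
```

In an attached notebook, `print(cluster_summary(spark))` should print the Spark version of the cluster. If the notebook is not attached, the cell cannot run at all.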

# 3. Default Notebook Language

Each notebook has a **default language** (Python, SQL, Scala, or R).

* The default language is shown at the top-right of the notebook.
* You can change it at any time.

In this notebook, we will use **multiple languages together**.

# 4. Notebook Basics
Databricks notebooks execute code **cell by cell**. Each cell can contain code or text.

### Cell Types
- **Code Cell** – Executes code  
- **Markdown Cell** – Used for documentation and explanations  

### Running Cells
- Run a single cell: `Shift + Enter`  
- Run all cells: **Run → Run All**  
- Stop execution: **Cancel** button  

In [0]:
print("Hello World")

# 5. Basic Markdown
Markdown cells are used to **document, explain, and structure** your notebook.
They make notebooks easier to read, understand, and share.


# Heading 1 (Title)
## Heading 2 (Section)
### Heading 3 (Subsection)

---

**Bold text**

*Italic text*

`inline code`

This notebook explains **Delta Lake** using `Spark SQL`.

---

Ordered list
1. first
2. second
3. third


Unordered list
* Coffee
* Bread
* Milk

---

Tables:
| user_id | user_name|
|---------|----------|
|     1   |  Danny   |
|     2   |  Nancy   |
|     3   |  James   |

---

Markdown link: [Databricks Documentation](https://docs.databricks.com)

HTML link: <a href="https://docs.databricks.com" target="_blank">Databricks Documentation</a>

Images:
![Associate-Badge](https://www.databricks.com/wp-content/uploads/2022/04/associate-badge-eng.svg)
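
When a table has more than a few rows, typing the pipes by hand gets tedious. A minimal sketch of generating the same Markdown table from Python data follows; the helper name `to_markdown_table` is hypothetical, and the printed text is meant to be pasted into a Markdown cell.

```python
def to_markdown_table(headers, rows):
    """Build a Markdown table string from column headers and row tuples."""
    header_line = "| " + " | ".join(headers) + " |"
    separator = "| " + " | ".join("---" for _ in headers) + " |"
    body = ["| " + " | ".join(str(value) for value in row) + " |" for row in rows]
    return "\n".join([header_line, separator, *body])


# Same data as the table shown above.
table = to_markdown_table(
    ["user_id", "user_name"],
    [(1, "Danny"), (2, "Nancy"), (3, "James")],
)
print(table)
```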




# 6. Multi-Language Support with Magic Commands

Databricks supports **magic commands**, which allow you to override the notebook’s default language **per cell**.

Supported language magics:

* `%python`
* `%sql`
* `%scala`
* `%r`

In [0]:
%sql
select 'Hello World'

In [0]:
%scala
println("Hello, World")

In [0]:
%r
print("Hello, World!")
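
The per-cell override can be pictured as a lookup on the first token of the cell. The sketch below only models that behavior in plain Python; it is not how Databricks implements magics, and the names `LANGUAGE_MAGICS` and `resolve_cell_language` are hypothetical.

```python
# Language magics listed above, mapped to the language they select.
LANGUAGE_MAGICS = {"%python": "python", "%sql": "sql", "%scala": "scala", "%r": "r"}


def resolve_cell_language(cell_source, default_language="python"):
    """Return (language, code) for a cell, honoring a leading language magic."""
    first_line, _, rest = cell_source.lstrip().partition("\n")
    parts = first_line.split(None, 1)
    if parts and parts[0] in LANGUAGE_MAGICS:
        # Anything after the magic on the same line is treated as code too.
        inline = parts[1] if len(parts) > 1 else ""
        code = "\n".join(piece for piece in (inline, rest) if piece)
        return LANGUAGE_MAGICS[parts[0]], code
    # No magic: the cell runs in the notebook's default language.
    return default_language, cell_source


print(resolve_cell_language("%sql\nselect 'Hello World'"))
print(resolve_cell_language('print("Hello World")'))
```

A cell without a magic falls through to the default language, which is why the first `print("Hello World")` cell in this notebook needs no prefix when the default is Python.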

### Magic Command - `%run` 

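The `%run` magic command **executes another notebook** inline, inside the current notebook. It must be in a cell by itself.

Functions and variables defined in the called notebook become available in the calling notebook, which lets you **reuse code**, **share logic**, and **build modular notebooks**.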

In [0]:
%run ./Databricks_another_notebook

In [0]:
# `Greeting` is expected to be defined by the notebook executed with %run above
print(Greeting)
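
The cell above can print `Greeting` because `%run` executes the other notebook inline, so its definitions land in the calling notebook's scope. The following plain-Python analogy illustrates the idea with `exec`; it is not how Databricks implements `%run`, and the source string and its value are made up for illustration.

```python
# Stand-in for the code inside the notebook that %run would execute.
# The variable name `Greeting` matches the cell above; its value is invented.
child_notebook_source = 'Greeting = "Hello from the other notebook"'

# The calling notebook's variables, modeled as a plain dictionary.
caller_namespace = {}

# Executing the child's code in the caller's namespace is what makes
# the child's definitions visible to the caller afterwards.
exec(child_notebook_source, caller_namespace)

print(caller_namespace["Greeting"])
```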

### Magic Command - `%sh`

The `%sh` magic command allows you to execute **Linux shell commands** directly on the **Spark driver node** in a Databricks notebook.
It is commonly used for basic system checks, file inspection, and package installation during development.

In [0]:
%sh
ls -l
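
When the output of a shell command is needed as a Python value, a roughly equivalent approach is the standard library's `subprocess` module, which also runs on the driver when called from a Python cell. This is a minimal sketch that assumes a Linux environment where `ls` is available.

```python
import subprocess

# Run `ls -l` in the current working directory and capture its output as text.
# check=True raises CalledProcessError if the command exits with a non-zero status.
result = subprocess.run(["ls", "-l"], capture_output=True, text=True, check=True)

# Each line of stdout corresponds to one line of the `ls -l` listing.
for line in result.stdout.splitlines():
    print(line)
```

Unlike `%sh`, this keeps the listing in `result.stdout`, so later Python cells can parse or filter it.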

### Magic Command - `%fs` 
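The `%fs` magic command is used to interact with the **Databricks File System (DBFS)** and cloud storage mounted to Databricks. It is shorthand for the `dbutils.fs` utilities, so `%fs ls /` corresponds to `dbutils.fs.ls("/")`.

It provides commands similar to Linux file system commands, such as `ls`, `cp`, and `rm`.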

In [0]:
%fs ls /

In [0]:
%fs ls dbfs:/databricks-datasets/

In [0]:
%fs head dbfs:/databricks-datasets/SPARK_README.md

### `dbutils` Overview

`dbutils` is a Databricks utility library that helps you interact with
the Databricks environment **outside of Spark processing**.

It is commonly used for:
- File system operations
- Secrets management
- Notebook workflows
- Parameterized execution
- Environment utilities
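
As one illustration of parameterized execution, the sketch below declares a text widget and reads its value through `dbutils.widgets`. The helper name `read_parameter` and the parameter name in the usage note are hypothetical; `dbutils` is passed in as an argument only so the function can be defined and tested outside Databricks.

```python
def read_parameter(dbutils, name, default_value):
    """Declare a text widget with a default value and return its current value.

    `dbutils` is assumed to be the object Databricks provides in notebooks.
    """
    # Arguments: widget name, default value, label shown in the notebook UI.
    dbutils.widgets.text(name, default_value, name)
    return dbutils.widgets.get(name)
```

In a notebook this could be called as `run_date = read_parameter(dbutils, "run_date", "2024-01-01")`, where `run_date` is an example name rather than anything defined elsewhere in this notebook.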


In [0]:
dbutils.fs.ls("/")

In [0]:
files = dbutils.fs.ls("/")
display(files)
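
The listing returned by `dbutils.fs.ls` can also be processed in Python. The sketch below separates directories from files; it assumes each entry exposes `name` and `size` attributes and that directory names end with `/`. The helper name `summarize_listing` is hypothetical.

```python
def summarize_listing(entries):
    """Split a file listing into directory names and (file name, size) pairs.

    Assumes entries shaped like the results of `dbutils.fs.ls`:
    each has `name` and `size`, and directory names end with "/".
    """
    directories = [entry.name for entry in entries if entry.name.endswith("/")]
    files = [(entry.name, entry.size) for entry in entries if not entry.name.endswith("/")]
    return directories, files
```

For example, `directories, files = summarize_listing(dbutils.fs.ls("/databricks-datasets/"))` would return the two lists for that folder.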

In [0]:
dbutils.fs.head("/databricks-datasets/README.md")

In [0]:
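# head() returns only the beginning of the file (up to maxBytes, 65536 by default),
# so the values below describe that preview, not necessarily the whole file.
content = dbutils.fs.head("/databricks-datasets/README.md")

lines = content.split("\n")

print("Number of lines read:", len(lines))
print("First line:", lines[0])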


In [0]:
dbutils.help()

In [0]:
dbutils.fs.help()

In [0]:
# Create a scratch directory, then list the root to confirm it exists
dbutils.fs.mkdirs('/temp')
dbutils.fs.ls('/')

In [0]:
# Remove the scratch directory (pass True as a second argument to delete a non-empty directory)
dbutils.fs.rm('/temp')

In [0]:
dbutils.fs.ls('/')
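
The create, inspect, and remove steps above can be combined into one helper. This is a sketch under stated assumptions: the path `/temp/hello.txt` is illustrative, the helper name `scratch_round_trip` is hypothetical, and `dbutils` is passed in as an argument so the function can be defined and tested outside Databricks.

```python
def scratch_round_trip(dbutils, directory="/temp", file_name="hello.txt"):
    """Create a directory, write a small file, read it back, then clean up."""
    path = f"{directory}/{file_name}"
    dbutils.fs.mkdirs(directory)
    # Third argument True: overwrite the file if it already exists.
    dbutils.fs.put(path, "Hello World", True)
    content = dbutils.fs.head(path)
    # Second argument True: remove the directory and everything inside it.
    dbutils.fs.rm(directory, True)
    return content
```

Calling `scratch_round_trip(dbutils)` in a notebook should return the text that was written and leave no `/temp` directory behind.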