# 2. Getting Data Into Databricks

Every data journey starts with ingestion. Databricks is an open platform that provides multiple ways to load your data, from simple point-and-click interfaces to automated, enterprise-grade tools. Let's explore four common methods.

## Method 1: Point-and-Click with Lakeflow Connect

The easiest way to connect to a data source is to use the built-in connectors. This UI-driven approach is well suited to quickly loading data from external sources such as MySQL, Postgres, and Salesforce without writing code.

The video below provides a fantastic overview of how to connect to sources using the UI.

[![Video Thumbnail](https://img.youtube.com/vi/7uHLVQSHVAw/0.jpg)](https://www.youtube.com/watch?v=7uHLVQSHVAw&t=1s)

### To Try It Yourself:
1. In the left navigation bar, click **+ New** > **Add data**.
2. Click **Lakeflow Connect** to browse available sources.

📖 **Resource:** [Lakeflow Connect Documentation](https://docs.databricks.com/en/connect/index.html)

## Method 2: Partner Connect with Fivetran & Others

Databricks partners with leading data integration companies like Fivetran, Airbyte, and Informatica. These partners provide pre-built, managed connectors for hundreds of SaaS applications (e.g., Google Analytics, Stripe, HubSpot).

**Partner Connect** is a feature in the Databricks UI that simplifies the process of connecting these tools to your workspace.

### To Explore:
1. In the left navigation bar, go to **Partner Connect**.
2. Find a data integration partner like **Fivetran** and follow the on-screen steps to connect.

📖 **Resource:** [Partner Connect Documentation](https://docs.databricks.com/en/partner-connect/index.html)

## Method 3: Automated Ingestion with Auto Loader

For data landing in cloud storage (S3, ADLS, GCS), **Auto Loader** is the tool to reach for. It incrementally processes new files as they arrive, keeps track of which files it has already ingested, and handles schema changes and scaling for you. This is the recommended best practice for file-based ingestion.
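
The incremental behavior described above can be sketched in Python. This is a minimal sketch, not the code from the demo notebook: the helper names (`auto_loader_options`, `start_ingest`), the paths, and the table name are hypothetical, and it assumes a Databricks notebook where `spark` is already defined.

```python
def auto_loader_options(file_format, schema_location):
    """Build reader options for an Auto Loader (cloudFiles) stream."""
    return {
        # Format of the incoming files, e.g. "json", "csv", "parquet"
        "cloudFiles.format": file_format,
        # Where Auto Loader stores the inferred schema between runs
        "cloudFiles.schemaLocation": schema_location,
        # Infer real column types instead of reading everything as strings
        "cloudFiles.inferColumnTypes": "true",
    }


def start_ingest(spark, source_path, target_table, checkpoint_path, file_format="json"):
    """Ingest files not yet processed from source_path into a Delta table.

    Assumption: the checkpoint path doubles as the schema location.
    """
    options = auto_loader_options(file_format, checkpoint_path)
    return (
        spark.readStream
        .format("cloudFiles")
        .options(**options)
        .load(source_path)
        .writeStream
        # The checkpoint records which files have already been ingested
        .option("checkpointLocation", checkpoint_path)
        # Process everything currently available, then stop
        .trigger(availableNow=True)
        .toTable(target_table)
    )
```

In a notebook, a call might look like `start_ingest(spark, "s3://your-bucket/incoming-data/", "main.default.raw_data", "s3://your-bucket/_checkpoints/raw_data")`, where every path and name is a placeholder. Because progress is stored in the checkpoint, running it again should pick up only files that arrived since the previous run.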

Your setup script installed the `auto-loader` demo, which provides a hands-on example.

### To See it in Action:

Navigate to the `auto-loader` demo folder in your workspace and open the **`01-Auto-Loader-and-Schema-Evolution`** notebook. This notebook shows how you can easily ingest files with just a few lines of SQL or Python.

[![Video Thumbnail](https://img.youtube.com/vi/2F6mBvLoavs/0.jpg)](https://www.youtube.com/watch?v=2F6mBvLoavs&t=1s)

### 📖 Additional Resources:
* [Auto Loader Documentation](https://docs.databricks.com/en/ingestion/auto-loader/index.html)

## Method 4: Code-Based Ingestion Examples

For developers who prefer programmatic approaches, here are code examples for common ingestion patterns:

### Reading from Cloud Storage (S3, ADLS, GCS)

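```python
# Example: reading CSV files from cloud storage.
# `spark` is the SparkSession that Databricks notebooks provide automatically.
df = (
    spark.read
    .option("header", "true")       # first line of each file holds column names
    .option("inferSchema", "true")  # sample the data to guess column types
    .csv("s3a://your-bucket/path/to/files/*.csv")
)

# Save to a Delta table (overwrites the table if it already exists)
(
    df.write
    .format("delta")
    .mode("overwrite")
    .saveAsTable("main.default.your_table_name")
)
```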
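
Schema inference on CSV requires an extra pass over the data, so for recurring loads it can help to declare the schema up front. The following is a minimal sketch under stated assumptions: the column names, the `ddl_schema` and `load_csv_to_delta` helpers, and the choice of `append` mode are all hypothetical and chosen for illustration, and `spark` is passed in from the notebook.

```python
# Hypothetical columns for illustration; replace with your own
ORDERS_COLUMNS = [
    ("order_id", "BIGINT"),
    ("customer", "STRING"),
    ("amount", "DOUBLE"),
]


def ddl_schema(columns):
    """Join (name, type) pairs into a Spark DDL schema string."""
    return ", ".join(f"{name} {dtype}" for name, dtype in columns)


def load_csv_to_delta(spark, source_path, target_table, columns):
    """Read CSV files with an explicit schema and append them to a Delta table."""
    df = (
        spark.read
        .option("header", "true")
        # An explicit schema replaces inferSchema and skips the inference pass
        .schema(ddl_schema(columns))
        .csv(source_path)
    )
    # Append keeps existing rows instead of replacing the table
    df.write.format("delta").mode("append").saveAsTable(target_table)
    return df
```

A call such as `load_csv_to_delta(spark, "s3a://your-bucket/path/to/files/*.csv", "main.default.your_table_name", ORDERS_COLUMNS)` would then load the files using the declared types.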

### Auto Loader with SQL (Recommended for Production)

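```sql
%sql
-- Auto Loader example: automatically ingest new files as they arrive.
-- This statement is meant to run as part of a pipeline (Delta Live Tables);
-- cloud_files() is not available in ordinary SQL queries.
CREATE OR REFRESH STREAMING LIVE TABLE raw_data
AS SELECT * FROM cloud_files(
  "s3://your-bucket/incoming-data/",
  "json",
  map("cloudFiles.inferColumnTypes", "true")
)
```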

## Comprehensive Resource Library

### 📚 **Official Documentation**
* [Data Ingestion on Databricks - Complete Guide](https://docs.databricks.com/en/ingestion/index.html)
* [Auto Loader Deep Dive](https://docs.databricks.com/en/ingestion/auto-loader/index.html)
* [Partner Connect Documentation](https://docs.databricks.com/en/partner-connect/index.html)
* [Lakeflow Connect (Data Connectors)](https://docs.databricks.com/en/connect/index.html)
* [File Upload and Data Import](https://docs.databricks.com/en/ingestion/add-data/index.html)

### 🛠️ **Hands-On Tutorials**
* [Data Engineering with Databricks Course](https://www.databricks.com/learn/training/data-engineering-courses)

### 📖 **Advanced Reading**
* [Schema Evolution in Auto Loader](https://docs.databricks.com/en/ingestion/auto-loader/schema.html)
* [Real-time Streaming Ingestion Patterns](https://docs.databricks.com/en/structured-streaming/index.html)

### 🏗️ **Architecture Patterns**
* [Medallion Architecture for Data Ingestion](https://www.databricks.com/glossary/medallion-architecture)

### 💡 **Community Resources**
* [Reddit - r/databricks](https://www.reddit.com/r/databricks/)