### Bronze Layer Data Ingestion Summary

- Loaded raw CSV data for **students**, **courses**, **enrollments**, and **results** from a Unity Catalog volume.
- Ingested each dataset into Spark DataFrames with schema inference and header recognition.
- Persisted each DataFrame as a **Delta table** in the `kusha_solutions.Jeevan` schema, following the bronze layer convention:
  - `bronze_students`
  - `bronze_courses`
  - `bronze_enrollments`
  - `bronze_results`
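- Wrote every table in overwrite mode so that reruns replace existing data instead of appending to it; the `bronze_students` write also sets `overwriteSchema`, so a changed inferred schema replaces the old one.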
- Verified data and schema for each bronze table using interactive display and schema printout.
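
The four ingestion steps summarized above follow one pattern, so they could be driven by a single loop. The following is a minimal sketch, not the code this notebook actually runs: `SCHEMA_NAME`, `VOLUME_ROOT`, `DATASETS`, `bronze_targets`, and `ingest_all` are hypothetical names, and the path layout is assumed from the four CSV paths used in the cells of this notebook. `ingest_all` expects the `spark` session that Databricks provides.

```python
# Hypothetical names for illustration; the path layout mirrors the cells in this notebook.
SCHEMA_NAME = "kusha_solutions.Jeevan"
VOLUME_ROOT = "/Volumes/kusha_solutions/jeevan/my_volume/csv_data/raw"
DATASETS = ("students", "courses", "enrollments", "results")


def bronze_targets(datasets=DATASETS, schema_name=SCHEMA_NAME, root=VOLUME_ROOT):
    """Map each dataset name to its (CSV path, bronze table name) pair."""
    return {
        name: (f"{root}/{name}/{name}.csv", f"{schema_name}.bronze_{name}")
        for name in datasets
    }


def ingest_all(spark, targets=None):
    """Read each CSV with header and schema inference, then overwrite its Delta table."""
    targets = bronze_targets() if targets is None else targets
    for csv_path, table_name in targets.values():
        df = spark.read.option("header", True).option("inferSchema", True).csv(csv_path)
        (
            df.write.format("delta")
            .mode("overwrite")
            .option("overwriteSchema", "true")  # assumption: applied to every table
            .saveAsTable(table_name)
        )
        print(f"Wrote {table_name}")
```

In a notebook cell this would be invoked as `ingest_all(spark)`. Keeping the path and table-name mapping in a separate pure function makes it possible to check the targets without a running cluster.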

In [0]:
# Set the schema (database) name
schema_name = "kusha_solutions.Jeevan"

# Define the path to the students CSV file in the Unity Catalog volume
students_csv_path = "/Volumes/kusha_solutions/jeevan/my_volume/csv_data/raw/students/students.csv"

# Read the students CSV file into a Spark DataFrame with header and schema inference
students_df = spark.read.option("header", True).option("inferSchema", True).csv(students_csv_path)

# Write the DataFrame to a Delta table in overwrite mode
students_df.write.format("delta").mode("overwrite").option("overwriteSchema", "true").saveAsTable(f"{schema_name}.bronze_students")

# Print a completion message that names the table written
print(f"Wrote {schema_name}.bronze_students")

In [0]:
# Set the table name for the bronze_students Delta table in the specified schema
table_name = "kusha_solutions.Jeevan.bronze_students"

# Load the bronze_students table into a Spark DataFrame
students_df = spark.table(table_name)

# Display the contents of the students DataFrame in a Databricks interactive table
display(students_df)

# Print the schema of the students DataFrame to show column names and data types
students_df.printSchema()

In [0]:
# Set the schema (database) name
schema_name = "kusha_solutions.Jeevan"

# Define the path to the courses CSV file in the Unity Catalog volume
courses_csv_path = "/Volumes/kusha_solutions/jeevan/my_volume/csv_data/raw/courses/courses.csv"

# Read the courses CSV file into a Spark DataFrame with header and schema inference
courses_df = spark.read.option("header", True).option("inferSchema", True).csv(courses_csv_path)

# Write the DataFrame to a Delta table in overwrite mode
courses_df.write.format("delta").mode("overwrite").saveAsTable(f"{schema_name}.bronze_courses")

# Print a completion message that names the table written
print(f"Wrote {schema_name}.bronze_courses")

In [0]:
# Set the table name for the bronze_courses Delta table in the specified schema
table_name = "kusha_solutions.Jeevan.bronze_courses"

# Load the bronze_courses table into a Spark DataFrame
courses_df = spark.table(table_name)

# Display the contents of the courses DataFrame in a Databricks interactive table
display(courses_df)

# Print the schema of the courses DataFrame to show column names and data types
courses_df.printSchema()

In [0]:
# Set the schema (database) name
schema_name = "kusha_solutions.Jeevan"

# Define the path to the enrollments CSV file in the Unity Catalog volume
enrollments_csv_path = "/Volumes/kusha_solutions/jeevan/my_volume/csv_data/raw/enrollments/enrollments.csv"

# Read the enrollments CSV file into a Spark DataFrame with header and schema inference
enrollments_df = spark.read.option("header", True).option("inferSchema", True).csv(enrollments_csv_path)

# Write the DataFrame to a Delta table in overwrite mode
enrollments_df.write.format("delta").mode("overwrite").saveAsTable(f"{schema_name}.bronze_enrollments")

# Print a completion message that names the table written
print(f"Wrote {schema_name}.bronze_enrollments")

In [0]:
# Specify the table name for the bronze_enrollments Delta table in the kusha_solutions.Jeevan schema
table_name = "kusha_solutions.Jeevan.bronze_enrollments"

# Load the bronze_enrollments table into a Spark DataFrame
enrollments_df = spark.table(table_name)

# Show the contents of the enrollments DataFrame in the Databricks interactive display
display(enrollments_df)

# Output the schema of the enrollments DataFrame to display column names and data types
enrollments_df.printSchema()

In [0]:
# Set the schema (database) name
schema_name = "kusha_solutions.Jeevan"

# Define the path to the results CSV file in the Unity Catalog volume
results_csv_path = "/Volumes/kusha_solutions/jeevan/my_volume/csv_data/raw/results/results.csv"

# Read the results CSV file into a Spark DataFrame with header and schema inference
results_df = spark.read.option("header", True).option("inferSchema", True).csv(results_csv_path)

# Write the DataFrame to a Delta table in overwrite mode
results_df.write.format("delta").mode("overwrite").saveAsTable(f"{schema_name}.bronze_results")

# Print a completion message that names the table written
print(f"Wrote {schema_name}.bronze_results")

In [0]:
# Specify the Delta table name for results data in the kusha_solutions.Jeevan schema
table_name = "kusha_solutions.Jeevan.bronze_results"

# Load the bronze_results Delta table into a Spark DataFrame
results_df = spark.table(table_name)

# Display the contents of the results DataFrame in a Databricks interactive table
display(results_df)

# Print the schema of the results DataFrame to show column names and data types
results_df.printSchema()
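
The four verification cells repeat the same load-and-inspect steps, which could be collected into one summary. This is a rough sketch under stated assumptions: `summarize_bronze_tables` is a hypothetical helper, not part of the notebook, and it reports a row count and the column types instead of calling `display`. It relies only on `spark.table`, `DataFrame.count`, and `DataFrame.dtypes`.

```python
def summarize_bronze_tables(
    spark,
    schema_name="kusha_solutions.Jeevan",
    datasets=("students", "courses", "enrollments", "results"),
):
    """Return {table_name: (row_count, [(column, type), ...])} for each bronze table."""
    summary = {}
    for name in datasets:
        table_name = f"{schema_name}.bronze_{name}"
        df = spark.table(table_name)
        # dtypes is a list of (column name, type string) pairs
        summary[table_name] = (df.count(), list(df.dtypes))
    return summary
```

In a notebook cell, `summarize_bronze_tables(spark)` would return one entry per bronze table, which makes an empty table or an unexpected inferred type easier to spot than scanning four separate outputs.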