## Overview

This notebook shows how to create and query a table or DataFrame from a file that you uploaded to DBFS. [DBFS](https://docs.databricks.com/user-guide/dbfs-databricks-file-system.html) (Databricks File System) lets you store data for querying inside Databricks. This notebook assumes that the file you want to read is already in DBFS.

This notebook is written in **Python**, so the default cell type is Python. You can use a different language in a cell by starting it with the `%LANGUAGE` magic command (for example, `%sql`). Python, Scala, SQL, and R are all supported.

In [0]:
# File location and type
file_location = "/FileStore/tables/airports.csv"
file_type = "csv"

# CSV options
infer_schema = "false"
first_row_is_header = "true"
delimiter = ","

# The applied options are for CSV files. For other file types, these will be ignored.
df = spark.read.format(file_type) \
  .option("inferSchema", infer_schema) \
  .option("header", first_row_is_header) \
  .option("sep", delimiter) \
  .load(file_location)

display(df)

faa,name,lat,lon,alt,tz,dst
04G,Lansdowne Airport,41.1304722,-80.6195833,1044,-5,A
06A,Moton Field Municipal Airport,32.4605722,-85.6800278,264,-5,A
06C,Schaumburg Regional,41.9893408,-88.1012428,801,-6,A
06N,Randall Airport,41.431912,-74.3915611,523,-5,A
09J,Jekyll Island Airport,31.0744722,-81.4277778,11,-4,A
0A9,Elizabethton Municipal Airport,36.3712222,-82.1734167,1593,-4,A
0G6,Williams County Airport,41.4673056,-84.5067778,730,-5,A
0G7,Finger Lakes Regional Airport,42.8835647,-76.7812318,492,-5,A
0P2,Shoestring Aviation Airfield,39.7948244,-76.6471914,1000,-5,U
0S9,Jefferson County Intl,48.0538086,-122.8106436,108,-8,A
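Because `inferSchema` is set to `"false"` above, Spark reads every column as a string, including numeric-looking ones such as `lat` and `alt`. The sketch below is only a plain-Python analogy using the standard `csv` module, not Spark itself: it parses two of the rows shown above with a header row and a comma delimiter, and shows that the values stay strings until they are cast explicitly. The variable names `sample` and `rows` are illustrative.

```python
import csv
import io

# Two rows copied from the output above; the first line names the columns.
sample = """faa,name,lat,lon,alt,tz,dst
04G,Lansdowne Airport,41.1304722,-80.6195833,1044,-5,A
06A,Moton Field Municipal Airport,32.4605722,-85.6800278,264,-5,A
"""

# DictReader treats the first row as the header and splits on the delimiter,
# loosely analogous to header="true" and sep="," in the Spark reader.
rows = list(csv.DictReader(io.StringIO(sample), delimiter=","))

# No type inference happens: every value is still a string.
print(type(rows[0]["lat"]).__name__)

# Casting is a separate, explicit step.
lat = float(rows[0]["lat"])
alt = int(rows[0]["alt"])
print(lat, alt)
```

In Spark, the comparable choices would be to set `inferSchema` to `"true"`, which costs an extra pass over the data, or to supply a schema explicitly.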


A `SparkSession` has a `.read` attribute with several methods for reading different data sources into Spark DataFrames. Using these, you can create a DataFrame from a .csv file, much as you would with a regular pandas DataFrame.

The variable `file_path` is a string holding the path to the file `airports.csv`, which contains information about different airports all over the world.

In [0]:
# Don't change this file path
file_path = "/FileStore/tables/airports.csv"

# Read in the airports data
airports = spark.read.csv(file_path, header=True)

# Show the data
airports.show()
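With `header=True` alone, `spark.read.csv` takes the column names from the first row but still reads every column as a string. One possible way to get typed columns is to pass a schema as a DDL string, as in the minimal sketch below. The helper name `read_airports` and the column types are assumptions guessed from the sample rows shown earlier, not something the dataset documents, so adjust them if the full file differs.

```python
# Hypothetical column types, guessed from the sample rows shown earlier.
AIRPORTS_SCHEMA_DDL = (
    "faa STRING, name STRING, lat DOUBLE, lon DOUBLE, "
    "alt INT, tz DOUBLE, dst STRING"
)


def read_airports(spark, path, schema=AIRPORTS_SCHEMA_DDL):
    """Read the airports CSV with an explicit schema instead of all-string columns.

    `spark` is an existing SparkSession (predefined in Databricks notebooks).
    """
    # header=True skips the first row as data and uses it for column names;
    # schema tells Spark which type to parse each column as.
    return spark.read.csv(path, header=True, schema=schema)
```

In a notebook cell this could be used as `airports = read_airports(spark, file_path)`, followed by `airports.printSchema()` to check the resulting column types.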