## Overview

This notebook shows how to create and query a table or DataFrame from a file that you uploaded to DBFS. [DBFS](https://docs.databricks.com/user-guide/dbfs-databricks-file-system.html), the Databricks File System, lets you store data for querying inside Databricks. The notebook assumes that the file you want to read is already in DBFS.

This notebook is written in **Python**, so the default cell type is Python. You can use a different language in a cell by starting it with the `%LANGUAGE` syntax, for example `%sql`. Python, Scala, SQL, and R are all supported.

In [0]:
# File location and type
file_location = "/FileStore/tables/CogsleyServices_SalesData_US.csv"
file_type = "csv"

# CSV options
infer_schema = "false"
first_row_is_header = "true"
delimiter = ","

# The applied options are for CSV files. For other file types, these will be ignored.
df = spark.read.format(file_type) \
  .option("inferSchema", infer_schema) \
  .option("header", first_row_is_header) \
  .option("sep", delimiter) \
  .load(file_location)
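
The cell above passes the CSV options regardless of file type and relies on them being ignored for other formats. As a minimal sketch of an alternative, the helper below only builds the options that apply to the chosen format and passes them in one call. The names `reader_options` and `load_file` are hypothetical and not part of the original notebook; `spark` is assumed to be the session object that Databricks provides in every notebook.

```python
def reader_options(file_type, infer_schema="false", header="true", sep=","):
    """Return reader options for the given file type (hypothetical helper).

    Only CSV gets options here; other formats get an empty dict,
    so nothing irrelevant is handed to the reader.
    """
    if file_type.lower() == "csv":
        return {"inferSchema": infer_schema, "header": header, "sep": sep}
    return {}


def load_file(spark, file_location, file_type):
    """Load a file into a DataFrame using only the relevant options."""
    opts = reader_options(file_type)
    # DataFrameReader.options accepts keyword arguments, one per option.
    return spark.read.format(file_type).options(**opts).load(file_location)
```

In a notebook cell this could be called as `df = load_file(spark, file_location, file_type)`. Note that with `inferSchema` set to `"false"`, as in the original cell, every CSV column is read as a string.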

In [0]:
# Create a view or table

temp_table_name = "CogsleyServices_SalesData_US_csv"

df.createOrReplaceTempView(temp_table_name)

In [0]:
%sql

/* Query the created temp table in a SQL cell */

SELECT * 
FROM `CogsleyServices_SalesData_US_csv`
LIMIT 10

RowID,OrderID,OrderDate,OrderMonthYear,Quantity,Quote,DiscountPct,Rate,SaleAmount,CustomerName,CompanyName,Sector,Industry,City,ZipCode,State,Region,ProjectCompleteDate,DaystoComplete,ProductKey,ProductCategory,ProductSubCategory,Consultant,Manager,HourlyWage,RowCount,WageMargin
1914,13729,2009-01-01,2009-01-01,9,1800,0.08,200,1640.96,Matt Bertelsons,The Priceline Group Inc.,Miscellaneous,Business Services,Bowie,20715,Maryland,East,2009-01-03,2,Development - Big Data,Development,Python,Noah Smith,Allen Young,59,1,0.71
4031,28774,2009-01-01,2009-01-01,32,6400,0.1,200,5707.67,Jessica Thornton,Garmin Ltd.,Capital Goods,Industrial Machinery/Components,McKeesport,15131,Pennsylvania,East,2009-01-02,1,Development - Big Data,Development,Market Research,Daniel Tusk,Allen Young,45,1,0.78
1279,9285,2009-01-02,2009-01-01,3,480,0.06,160,447.11,David O'Rourke,Wynn Resorts Limited,Consumer Services,Hotels/Resorts,Prior Lake,55372,Minnesota,Central,2009-01-04,2,Development - Java,Development,Python,Mason Gibson,Josh Martinez,71,1,0.56
5272,37537,2009-01-02,2009-01-01,4,500,0.0,125,495.47,Alan Brumley,Bed Bath & Beyond Inc.,Consumer Services,Home Furnishings,Napa,94559,California,West,2009-01-02,0,Training - Development,Training,Java,William Bufont,Bob Turner,62,1,0.5
5273,37537,2009-01-02,2009-01-01,43,5375,0.07,125,4953.46,Alan Brumley,Bed Bath & Beyond Inc.,Consumer Services,Home Furnishings,Napa,94559,California,West,2009-01-04,2,Training - Development,Training,Strategy,Liam Franklin,Bob Turner,52,1,0.58
5274,37537,2009-01-02,2009-01-01,32,6400,0.05,200,6024.92,Alan Brumley,Bed Bath & Beyond Inc.,Consumer Services,Home Furnishings,Napa,94559,California,West,2009-01-09,7,Development - Big Data,Development,.Net,Emma Watson,Bob Turner,67,1,0.67
6224,44069,2009-01-02,2009-01-01,16,1760,0.09,110,1587.09,Elizabeth Hansen,Fastenal Company,Consumer Services,RETAIL: Building Materials,Montebello,90640,California,West,2009-01-04,2,Development - Python,Development,Business Model,Sophia Dixon,Bob Turner,71,1,0.35
6225,44069,2009-01-02,2009-01-01,43,4730,0.08,110,4312.18,Elizabeth Hansen,Fastenal Company,Consumer Services,RETAIL: Building Materials,Montebello,90640,California,West,2009-01-02,0,Development - Python,Development,SQL,Mia Moore,Bob Turner,51,1,0.54
1074,7909,2009-01-03,2009-01-01,29,3480,0.03,120,3345.1,Alex Grayson,C.H. Robinson Worldwide Inc.,Transportation,Oil Refining/Marketing,Lake Oswego,97035,Oregon,West,2009-01-04,1,Development - Business Logic,Development,Market Research,Abigail Young,Bob Turner,50,1,0.58
1315,9637,2009-01-03,2009-01-01,12,1800,0.08,150,1641.04,Andy Willingham,DIRECTV,Consumer Services,Telecommunications Equipment,Baton Rouge,70802,Louisiana,South,2009-01-05,2,Consulting - Business Model,Consulting,Java,Madison Hill,Frank Mitchell,58,1,0.61
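
The same preview can be produced from a Python cell instead of a `%sql` cell. The sketch below assumes the temp view has already been registered and that `spark` is the notebook's session; the function name `preview_rows` is hypothetical.

```python
def preview_rows(spark, view_name, n=10):
    """Return a DataFrame with at most n rows from a registered view or table."""
    if n < 0:
        raise ValueError("n must be non-negative")
    # spark.table looks the name up in the catalog, including temp views;
    # limit(n) corresponds to the LIMIT clause in the SQL cell.
    return spark.table(view_name).limit(n)
```

For example, `preview_rows(spark, temp_table_name).show()` would print the first rows of the view created earlier.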


In [0]:
# Because the DataFrame is registered as a temp view, it is only available to this particular notebook. If you'd like other users to be able to query the data, you can create a table from the DataFrame instead.
# Once saved, the table persists across cluster restarts and can be queried by different users from different notebooks.
# To do so, choose your table name and run the line below.

permanent_table_name = "CogsleyServices_SalesData_US_csv"

df.write.format("parquet").saveAsTable(permanent_table_name)
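
Because `saveAsTable` by default raises an error when the table already exists, running the cell above a second time is likely to fail. A minimal sketch of one way to make the save mode explicit follows; the wrapper name `save_as_table` and the `VALID_MODES` set are hypothetical, while `mode` is the standard `DataFrameWriter` setting.

```python
# Save modes accepted by DataFrameWriter.mode
VALID_MODES = {"append", "overwrite", "ignore", "error", "errorifexists"}


def save_as_table(df, table_name, mode="errorifexists", file_format="parquet"):
    """Write df as a persistent table with an explicit save mode."""
    if mode not in VALID_MODES:
        raise ValueError(f"unknown save mode: {mode!r}")
    # "overwrite" replaces an existing table; "append" adds rows to it;
    # "ignore" leaves an existing table untouched.
    df.write.format(file_format).mode(mode).saveAsTable(table_name)
```

Calling `save_as_table(df, permanent_table_name, mode="overwrite")` would replace the table on each run, which is convenient while iterating but discards whatever the table held before.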