# Reading Data - Tables and Views

**Technical Accomplishments:**
* Demonstrate how to pre-register data sources in Microsoft Fabric.
* Introduce temporary views over files.
* Read data from tables/views.
* Note that `printRecordsPerPartition(..)`:
  * converts the specified `DataFrame` to an RDD,
  * counts the number of records in each partition, and
  * prints the results to the console.
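
The `Utilities` notebook is not shown here, so the following is only a minimal sketch of what a helper like `printRecordsPerPartition(..)` might do, following the three steps listed above. The name `print_records_per_partition`, the output format, and the returned list are assumptions for illustration, not the actual implementation.

```python
def print_records_per_partition(df):
    """Hypothetical sketch; the real helper lives in the Utilities notebook."""
    # Step 1: drop down to the RDD that backs the DataFrame.
    # Step 2: emit one count per partition by exhausting each partition's iterator.
    counts = df.rdd.mapPartitions(lambda rows: [sum(1 for _ in rows)]).collect()

    # Step 3: print one line per partition.
    print("Per-partition counts:")
    for index, count in enumerate(counts):
        print(f"#{index}: {count:,}")
    return counts


# In the notebook (where `spark` is predefined), usage would look like:
# print_records_per_partition(spark.read.table("taxi_zone_lookup"))
```

Counting inside `mapPartitions` keeps the work on the executors, so only one integer per partition is sent back to the driver.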

## ➡️ Getting Started

Run the following cell to configure our notebook.

In [None]:
%run Utilities

## ➡️ Upload a CSV file to the Lakehouse

1. Download the "Taxi Zone Lookup Table" CSV file from [here](https://github.com/weslbo/DP-601/raw/main/data/taxi_zone_lookup.csv), and save it to a location on your computer.

1. Create the `TaxiData` folder under the Files section of your Lakehouse.

1. Upload the file to the folder by using the Upload file item in the folder's contextual menu.

1. Once the file is uploaded, select the folder to see its contents.
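
With the file in place, one way to query it without creating a table is a temporary view over the file. The sketch below assumes the notebook has this Lakehouse attached as its default, so that the relative path `Files/TaxiData/taxi_zone_lookup.csv` resolves, and that the CSV has a header row. The function name `register_csv_view` and the view name `taxi_zone_lookup_csv` are hypothetical.

```python
def register_csv_view(spark, path, view_name):
    """Read a headered CSV file and expose it as a session-scoped temporary view."""
    df = (
        spark.read
        .option("header", True)       # first line holds the column names
        .option("inferSchema", True)  # let Spark infer column types (extra pass over the file)
        .csv(path)
    )
    # The view lives only for the current Spark session; nothing is written to Tables.
    df.createOrReplaceTempView(view_name)
    return df


# In the notebook (where `spark` is predefined), usage would look like:
# register_csv_view(spark, "Files/TaxiData/taxi_zone_lookup.csv", "taxi_zone_lookup_csv")
# display(spark.sql("SELECT * FROM taxi_zone_lookup_csv LIMIT 5"))
```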

## ➡️ Load the file to a Delta table

1. Right-click or use the ellipsis on the CSV file to access the contextual menu. Select Load to Tables and choose the New table option.

1. The Load to Tables dialog appears with a suggested table name. Special characters in the name are validated in real time as you type.

1. Select Load to execute the load.

1. The table now shows up in the Lakehouse explorer. Expand the table to see its columns and their types, or select the table to see a preview.
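
The same load can also be done from a notebook cell. This is a sketch of a roughly equivalent code path, not a description of what the Load to Tables dialog does internally; it assumes the default Lakehouse is attached and the CSV has a header row. The function name `load_csv_to_delta_table` is hypothetical.

```python
def load_csv_to_delta_table(spark, path, table_name):
    """Read a headered CSV file and save it as a Delta table in the Lakehouse."""
    df = (
        spark.read
        .option("header", True)       # first line holds the column names
        .option("inferSchema", True)  # let Spark infer column types
        .csv(path)
    )
    (
        df.write
        .format("delta")
        .mode("overwrite")            # replace the table if it already exists
        .saveAsTable(table_name)
    )
    return df


# In the notebook (where `spark` is predefined), usage would look like:
# load_csv_to_delta_table(spark, "Files/TaxiData/taxi_zone_lookup.csv", "taxi_zone_lookup")
```

Using `overwrite` makes the cell safe to re-run; switch to `append` or `errorifexists` if replacing an existing table is not what you want.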

## ➡️ Reading from a Table/View

We can now read in the "table" **taxi_zone_lookup** as a `DataFrame` with one simple command (and then print the schema):

In [None]:
taxi_zone_lookup_DF = spark.read.table("taxi_zone_lookup")
taxi_zone_lookup_DF.printSchema()

And of course we can now view that data as well:

In [None]:
display(taxi_zone_lookup_DF)

Let's take a look at some other details of the `DataFrame` we just created, for comparison's sake.

In [None]:
print("Partitions: " + str(taxi_zone_lookup_DF.rdd.getNumPartitions()))
print("-"*80)

## ➡️ Displaying data

Tables that can be loaded with `spark.read.table(..)` are also accessible through the SQL APIs.

For example, we already used Microsoft Fabric to expose **taxi_zone_lookup** as a table/view.

In [None]:
%%sql

SELECT * FROM taxi_zone_lookup LIMIT 5
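
The SQL route is also available from a Python cell through `spark.sql(..)`, which returns a regular `DataFrame`. A minimal sketch, with the helper name `first_rows` being hypothetical:

```python
def first_rows(spark, table_name, n=5):
    """Return the first `n` rows of a registered table or view as a DataFrame."""
    # The table name is interpolated directly into the query text,
    # so only pass identifiers you trust.
    return spark.sql(f"SELECT * FROM {table_name} LIMIT {int(n)}")


# In the notebook (where `spark` is predefined), usage would look like:
# display(first_rows(spark, "taxi_zone_lookup"))
```

Because the result is a `DataFrame`, it can be passed to `display(..)` or chained with further transformations, just like the result of `spark.read.table(..)`.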