## Create Data Frame for bike details using StructType

Develop a function to apply a schema while creating the Spark Data Frame using Spark `StructType`.
* Here are the details of the columns:

|Column Name|Data Type|
|-----------|---------|
|name|string|
|selling_price|int|
|year|int|
|seller_type|string|
|owner|string|
|km_driven|int|

In [0]:
bike_details = [('Royal Enfield Classic 350', 175000, 2019, 'Individual', '1st owner', 350),
 ('Honda Dio', 45000, 2017, 'Individual', '1st owner', 5650),
 ('Royal Enfield Classic Gunmetal Grey', 150000, 2018, 'Individual', '1st owner', 12000),
 ('Yamaha Fazer FI V 2.0 [2016-2018]', 65000, 2015, 'Individual', '1st owner', 23000),
 ('Yamaha SZ [2013-2014]', 20000, 2011, 'Individual', '2nd owner', 21000),
 ('Honda CB Twister', 18000, 2010, 'Individual', '1st owner', 60000),
 ('Honda CB Hornet 160R', 78500, 2018, 'Individual', '1st owner', 17000),
 ('Royal Enfield Bullet 350 [2007-2011]', 180000, 2008, 'Individual', '2nd owner', 39000),
 ('Hero Honda CBZ extreme', 30000, 2010, 'Individual', '1st owner', 32000),
 ('Bajaj Discover 125', 50000, 2016, 'Individual', '1st owner', 42000)]

## Step 1: Preview the data
* Let us first preview the data.

In [0]:
attributes

In [0]:
type(attributes)

In [0]:
bike_details

In [0]:
type(bike_details)

In [0]:
bike_details[0]

In [0]:
type(bike_details[0])
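
The preview shows that `bike_details` is a list of tuples. Before applying a schema, it can help to confirm that every tuple matches the column table. The following is a minimal pure-Python sketch of such a check; the names `EXPECTED_TYPES` and `find_bad_records` are hypothetical helpers introduced for illustration, and `sample` repeats two records from `bike_details` so the block runs on its own.

```python
# Expected Python type per position, following the column table of this exercise.
# EXPECTED_TYPES and find_bad_records are hypothetical names, not part of the exercise.
EXPECTED_TYPES = [
    ('name', str),
    ('selling_price', int),
    ('year', int),
    ('seller_type', str),
    ('owner', str),
    ('km_driven', int),
]


def find_bad_records(records):
    """Return (record index, problem) pairs for records that do not fit the table."""
    bad = []
    for idx, record in enumerate(records):
        if len(record) != len(EXPECTED_TYPES):
            bad.append((idx, 'wrong number of fields'))
            continue
        for (column, expected_type), value in zip(EXPECTED_TYPES, record):
            if not isinstance(value, expected_type):
                bad.append((idx, column))
    return bad


# Two records copied from bike_details so the sketch is self-contained.
sample = [
    ('Royal Enfield Classic 350', 175000, 2019, 'Individual', '1st owner', 350),
    ('Honda Dio', 45000, 2017, 'Individual', '1st owner', 5650),
]
print(find_bad_records(sample))
```

In the notebook, `find_bad_records(bike_details)` would run the same check on the full list; an empty result means every record lines up with the declared columns.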

## Step 2: Provide the solution
Now come up with the solution by developing the required logic. Once the logic is developed, go to the next step to take care of the validation.

In [0]:
# Your code should go here

from pyspark.sql.types import StructType, StructField, IntegerType, StringType

# Column types follow the table above: the int columns use IntegerType
schema = StructType([
    StructField('name', StringType()),
    StructField('selling_price', IntegerType()),
    StructField('year', IntegerType()),
    StructField('seller_type', StringType()),
    StructField('owner', StringType()),
    StructField('km_driven', IntegerType())
])

bike_details_df = spark.createDataFrame(bike_details, schema=schema)

## Step 3: Validate the function
Here are the steps to validate the function.

In [0]:
display(bike_details_df)

name,selling_price,year,seller_type,owner,km_driven
Royal Enfield Classic 350,175000,2019,Individual,1st owner,350
Honda Dio,45000,2017,Individual,1st owner,5650
Royal Enfield Classic Gunmetal Grey,150000,2018,Individual,1st owner,12000
Yamaha Fazer FI V 2.0 [2016-2018],65000,2015,Individual,1st owner,23000
Yamaha SZ [2013-2014],20000,2011,Individual,2nd owner,21000
Honda CB Twister,18000,2010,Individual,1st owner,60000
Honda CB Hornet 160R,78500,2018,Individual,1st owner,17000
Royal Enfield Bullet 350 [2007-2011],180000,2008,Individual,2nd owner,39000
Hero Honda CBZ extreme,30000,2010,Individual,1st owner,32000
Bajaj Discover 125,50000,2016,Individual,1st owner,42000


In [0]:
bike_details_df.columns # ['name', 'selling_price', 'year', 'seller_type', 'owner', 'km_driven']

In [0]:
bike_details_df.count() # 10
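
The column names and the row count do not reveal whether the declared data types were applied. One possible extra check compares `bike_details_df.dtypes`, which returns a list of `(column, type)` string pairs, with the column table. The sketch below is an assumption-laden illustration: `EXPECTED_DTYPES` and `find_dtype_mismatches` are hypothetical names, and a hand-written list stands in for `bike_details_df.dtypes` so the block runs without a Spark session.

```python
# Expected (column, type) pairs, following the column table of this exercise.
# EXPECTED_DTYPES and find_dtype_mismatches are hypothetical names for illustration.
EXPECTED_DTYPES = [
    ('name', 'string'),
    ('selling_price', 'int'),
    ('year', 'int'),
    ('seller_type', 'string'),
    ('owner', 'string'),
    ('km_driven', 'int'),
]


def find_dtype_mismatches(actual_dtypes, expected_dtypes=EXPECTED_DTYPES):
    """Return (column, expected type, actual type) for every column that differs."""
    actual = dict(actual_dtypes)
    return [
        (column, expected, actual.get(column))
        for column, expected in expected_dtypes
        if actual.get(column) != expected
    ]


# Stand-in for bike_details_df.dtypes: what a schema declaring
# every column as string would report.
all_string_dtypes = [(column, 'string') for column, _ in EXPECTED_DTYPES]
print(find_dtype_mismatches(all_string_dtypes))
```

In the notebook, `find_dtype_mismatches(bike_details_df.dtypes)` returning an empty list would indicate that the schema matches the table.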