### Unit 1. Learning Evidence 1 - Creating an analytical database


**General objective**

Analyze the factors that influence the price and rating of listings published on Airbnb, in order to understand which characteristics make a listing more successful or profitable on the platform.

**Specific objectives**

- Examine the main characteristics of the listings, such as location, number of beds, bathrooms, and amenities offered.

- Determine how these characteristics relate to the price and average rating of each listing.

- Identify the factors that most influence the number of reviews and the level of guest satisfaction.


**General problem statement**

Airbnb faces the ongoing challenge of optimizing the experience of both guests and hosts while maximizing the occupancy and profitability of the properties available in different regions of the world. However, it is not always evident how the physical characteristics of a listing, such as the number of beds, bathrooms, or amenities, influence its price, satisfaction level (rating), or volume of reviews received.

The dataset contains detailed information about each property, including variables such as price, rating, reviews, features, and amenities, as well as host data (host_name, host_id) and operational aspects such as check-in and check-out times.

In this context, the central problem is to understand the factors that most affect the performance and perception of listings within the Airbnb ecosystem, and how those factors can inform strategic decisions, both for host management and from the perspective of the platform itself.


**CHOSEN DATASET**

**Link**

https://www.kaggle.com/datasets/ashishjangra27/airbnb-dataset?select=airbnb.csv

The Airbnb dataset was chosen because it offers a wide variety of real data on the lodging market, which makes it possible to analyze how factors such as price, location, physical characteristics, and amenities influence user satisfaction and preference. In addition, the dataset is well suited to data analysis, visualization, and predictive modeling techniques, since it includes numeric, categorical, and text variables. Its richness and structure make it possible to explore complex relationships and draw conclusions that are useful both for decision-making and for hands-on learning in data science.


### ENTITY-RELATIONSHIP MODEL

![Modelo entidad relacion .png](.//workspaces/BigData/docs/statit/Modelo entidad relacion .png "Modelo entidad relacion .png")


If the image does not render, it can be found at docs/static/modelo entidad relacion.png


**Data dictionary**

**id:** Unique identifier for each listing

**name:** Name of the listing on Airbnb

**rating:** Average rating of the listing

**reviews:** Number of reviews received

**host_name:** Host's name

**host_id:** Unique identifier of the host

**address:** Location of the listing (city, region, country)

**features:** Summary of characteristics (number of guests, bedrooms, beds, bathrooms)

**amenities:** List of amenities offered

**price:** Price per night in local currency

**country:** Country where the listing is located

**bathrooms:** Number of bathrooms

**beds:** Number of beds

**guests:** Number of guests the listing can accommodate

**toiles:** Number of toilets (the dataset spells this column `toiles`)

**bedrooms:** Number of bedrooms

**studios:** Number of studio-type units

**checkin:** Check-in time

**checkout:** Check-out time
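
The `features` column packs into one comma-separated string the same counts that the dataset also exposes as separate columns (`guests`, `bedrooms`, `beds`, `bathrooms`). As a minimal sketch, assuming every entry has the form `<integer> <label>` as in the sample rows of this dataset, a hypothetical helper `parse_features` (not part of the original notebook) could split that string into a dictionary:

```python
# Hypothetical helper, not part of the original notebook.
# Assumption: each comma-separated entry looks like "<integer> <label>",
# e.g. "2 guests" or "1 bed".
def parse_features(features):
    counts = {}
    for entry in features.split(","):
        parts = entry.strip().split(maxsplit=1)
        if len(parts) != 2 or not parts[0].isdecimal():
            continue  # skip entries that do not match "<integer> <label>"
        number, label = parts
        # Normalize singular/plural labels ("bed", "beds") to the plural column name
        key = label.lower().rstrip("s") + "s"
        counts[key] = int(number)
    return counts


print(parse_features("2 guests,2 bedrooms,1 bed,1 bathroom"))
```

Comparing the parsed counts with the dedicated columns is one possible consistency check before relying on either source.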

### INSTALLING THE KAGGLE MODULE TO FETCH DATASETS FROM THE SITE

In [0]:
!pip install "kagglehub[pandas-datasets]>=0.3.8"

Note: you may need to restart the kernel using %restart_python or dbutils.library.restartPython() to use updated packages.


### IMPORTS REQUIRED FOR THE NOTEBOOK

In [0]:
import pandas as pd
import kagglehub
import os
import zipfile

### CODE TO DOWNLOAD THE DATASET, EXTRACT ZIP FILES, AND LOAD THE CSV

In [0]:
## Functions to download and extract datasets

def download_dataset_zip(url=""):
    # `url` is the Kaggle dataset handle in the form "owner/dataset-name"
    print("Downloading dataset from Kaggle...")
    dataset_path = kagglehub.dataset_download(url)
    print("Dataset path:", dataset_path)
    return dataset_path

## Extract ZIP files
def extract_zip_files(dataset_path):
    zip_files = [f for f in os.listdir(dataset_path) if f.endswith('.zip')]
    if zip_files:
        # Only the first ZIP file found is extracted
        zip_file = os.path.join(dataset_path, zip_files[0])
        extract_dir = os.path.join(dataset_path, "extracted")
        os.makedirs(extract_dir, exist_ok=True)
        print(f"Extracting {zip_file} into {extract_dir}...")
        with zipfile.ZipFile(zip_file, "r") as z:
            z.extractall(extract_dir)
        return extract_dir
    # No ZIP found: check whether CSV files already exist in the path
    csv_files = [f for f in os.listdir(dataset_path) if f.endswith('.csv')]
    if csv_files:
        print("No ZIP file found, but CSV files were detected; assuming the dataset is already extracted.")
        return dataset_path
    raise FileNotFoundError("No .zip or .csv files were found in the dataset path")

## Load a CSV into a DataFrame
def create_csv(csv_dir, csv_name=None):
    if csv_name is None:
        csv_files = sorted(f for f in os.listdir(csv_dir) if f.endswith('.csv'))
        if not csv_files:
            raise FileNotFoundError("No CSV files were found in the extracted directory")
        # Without an explicit name, fall back to the first CSV in alphabetical order
        csv_name = csv_files[0]
    file_path = os.path.join(csv_dir, csv_name)
    print(f"Reading {file_path}...")
    # latin1 never raises a decode error, but it mis-decodes UTF-8 text (see the address column)
    df = pd.read_csv(file_path, encoding="latin1")
    print("CSV created successfully")
    return df

### CALL THE FUNCTIONS ABOVE, PASSING THE KAGGLE DATASET HANDLE AND THE FILE NAME, TO DOWNLOAD THE DATASET AND LOAD THE CSV

In [0]:
dataset_path = download_dataset_zip("ashishjangra27/airbnb-dataset")
csv_dir = extract_zip_files(dataset_path)
df = create_csv(csv_dir, csv_name="airbnb.csv")

## DROP EMPTY COLUMNS (those whose name starts with "Unnamed")
df = df.loc[:, ~df.columns.str.contains('^Unnamed')]

Downloading dataset from Kaggle...
Dataset path: /home/spark-48c82d16-27d0-4128-8bfe-56/.cache/kagglehub/datasets/ashishjangra27/airbnb-dataset/versions/1
No ZIP file found, but CSV files were detected; assuming the dataset is already extracted.
Reading /home/spark-48c82d16-27d0-4128-8bfe-56/.cache/kagglehub/datasets/ashishjangra27/airbnb-dataset/versions/1/airbnb.csv...
CSV created successfully


### VERIFYING THE FILE LOAD

In [0]:
df.head(10)

Unnamed: 0,id,name,rating,reviews,host_name,host_id,address,features,amenities,safety_rules,hourse_rules,img_links,price,country,bathrooms,beds,guests,toiles,bedrooms,studios,checkin,checkout
0,49849504,Perla bungalov,4.71,64,Mehmetcan,357334205.0,"Kartepe, Kocaeli, Turkey","2 guests,2 bedrooms,1 bed,1 bathroom","Mountain view,Valley view,Lake access,Kitchen,...","ó¹,Airbnb's COVID-19 safety practices apply,...","Check-in: Flexible,Check out: 12:00 pm,Pets ar...",https://a0.muscache.com/im/pictures/a5da5cb7-c...,8078,Turkey,1,1,2,0,2,0,Flexible,12 00 pm
1,50891766,Authentic Beach Architect Sheltered Villa with...,New,0,Fatih,386223873.0,"KaÅ, Antalya, Turkey","4 guests,2 bedrooms,2 beds,2 bathrooms","Kitchen,Wifi,Dedicated workspace,Free parking ...","ó¹,Airbnb's COVID-19 safety practices apply,...","Check-in: 4:00 pm - 11:00 pm,Check out: 10:00 ...",https://a0.muscache.com/im/pictures/61b70855-2...,4665,Turkey,2,2,4,0,2,0,4 00 pm - 11 00 pm,10 00 am
2,50699164,cottages sataplia,4.85,68,Giorgi,409690853.0,"Imereti, Georgia","4 guests,1 bedroom,3 beds,1 bathroom","Mountain view,Kitchen,Wifi,Dedicated workspace...","ó¹,Airbnb's COVID-19 safety practices apply,...","Check-in: After 1:00 pm,Check out: 12:00 pm,Se...",https://a0.muscache.com/im/pictures/miso/Hosti...,5991,Georgia,1,3,4,0,1,0,After 1 00 pm,12 00 pm
3,49871422,Sapanca Breathable Bungalow,5.0,13,Melih,401873242.0,"Sapanca, Sakarya, Turkey","4 guests,1 bedroom,2 beds,1 bathroom","Mountain view,Valley view,Kitchen,Wifi,Free pa...","ó¹,Airbnb's COVID-19 safety practices apply,...","Check-in: After 2:00 pm,Check out: 12:00 pm,No...",https://a0.muscache.com/im/pictures/72e6396e-e...,11339,Turkey,1,2,4,0,1,0,After 2 00 pm,12 00 pm
4,51245886,Bungalov Ev 2,New,0,Arp Sapanca,414884116.0,"Sapanca, Sakarya, Turkey","2 guests,1 bedroom,1 bed,1 bathroom","Kitchen,Wifi,Free parking on premises,TV,Air c...","ó¹,Airbnb's COVID-19 safety practices apply,...","Check-in: After 2:00 pm,Check out: 12:00 pm,No...",https://a0.muscache.com/im/pictures/73973308-e...,6673,Turkey,1,1,2,0,1,0,After 2 00 pm,12 00 pm
5,48650769,CasaMia White Suite Treehouse,New,0,Casamia,261290482.0,"Sapanca, Sakarya, Turkey","2 guests,1 bedroom,2 beds,1 bathroom","Lake view,Mountain view,Waterfront,Wifi,Dedica...","ó¹,Airbnb's COVID-19 safety practices apply,...","Check-in: After 2:00 pm,Check out: 12:00 pm,No...",https://a0.muscache.com/im/pictures/miso/Hosti...,14729,Turkey,1,2,2,0,1,0,After 2 00 pm,12 00 pm
6,50765985,Ladin Bungalow,New,0,Stephen,15084529.0,"KaÅ, Antalya, Turkey","2 guests,1 bedroom,1 bed,1 bathroom","Garden view,Mountain view,Kitchen,Wifi,Dedicat...","ó¹,Airbnb's COVID-19 safety practices apply,...","Check-in: After 3:00 pm,Check out: 11:00 am,No...",https://a0.muscache.com/im/pictures/prohost-ap...,12312,Turkey,1,1,2,0,1,0,After 3 00 pm,11 00 am
7,40947216,Lavender House,New,0,Caner,318794897.0,"AkÃ§alÄ±, Giresun, Turkey","8 guests,1 bedroom,2 beds,1 bathroom","Wifi,Dedicated workspace,Free parking on premi...","ó¹,Airbnb's COVID-19 safety practices apply,...","Check-in: Flexible,Self check-in with building...",https://a0.muscache.com/im/pictures/97046585-1...,13655,Turkey,1,2,8,0,1,0,Flexible,
8,34043569,Prince's,New,0,Tu,221057563.0,"ThÃ nh phá» ÄÃ Láº¡t, LÃ¢m Äá»ng, Vietnam","2 guests,1 bedroom,1 bed,1 bathroom","Kitchen,Wifi,Dedicated workspace,Free parking ...","ó¹,Airbnb's COVID-19 safety practices apply,...","Check-in: After 1:00 pm,Check out: 11:00 am,Se...",https://a0.muscache.com/im/pictures/1bd2f3f8-0...,1747,Vietnam,1,1,2,0,1,0,After 1 00 pm,11 00 am
9,42075682,"The Cottage, Private Pool Villa",4.67,3,Sukanya,173126583.0,"Tambon Bang Kachai, Chang Wat Chanthaburi, Tha...","10 guests,4 bedrooms,6 beds,3 bathrooms","Kitchen,Wifi,Free parking on premises,Private ...","ó¹,Airbnb's COVID-19 safety practices apply,...","Check-in: After 3:00 pm,No smoking,No pets",https://a0.muscache.com/im/pictures/42d26843-4...,30486,Thailand,3,6,10,0,4,0,After 3 00 pm,
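
Several `address` values in the output above (for example `KaÅ` and `AkÃ§alÄ±`) have the shape that UTF-8 text takes when it is decoded as Latin-1, which is the encoding passed to `pd.read_csv`. Assuming that is the cause, the following sketch shows how a single string could be repaired by reversing the decoding step. `repair_mojibake` is a hypothetical helper, not part of the original notebook, and it leaves a value unchanged whenever the round trip does not apply:

```python
# Hypothetical helper, not part of the original notebook.
# Assumption: the garbled text is UTF-8 that was decoded as Latin-1.
def repair_mojibake(text):
    """Re-encode as Latin-1 and decode as UTF-8; return the input unchanged on failure."""
    try:
        return text.encode("latin1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return text


# "Kaş" encoded as UTF-8 and then decoded as Latin-1, written with explicit escapes
garbled = "Ka\u00c5\u009f, Antalya, Turkey"
print(repair_mojibake(garbled))  # prints "Kaş, Antalya, Turkey"
```

The repair is done per value rather than by changing the `encoding` argument because it is not confirmed here that the whole file is valid UTF-8.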


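### CREATING A VIEW

In [0]:

# spark_df must already exist: run the cell that calls spark.createDataFrame(df) first.
# createOrReplaceTempView avoids an error when the cell is re-run and the view already exists.
spark_df.createOrReplaceTempView("views_airbnb")
display(spark_df.limit(10))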

id,name,rating,reviews,host_name,host_id,address,features,amenities,safety_rules,hourse_rules,img_links,price,country,bathrooms,beds,guests,toiles,bedrooms,studios,checkin,checkout
49849504,Perla bungalov,4.71,64,Mehmetcan,357334205.0,"Kartepe, Kocaeli, Turkey","2 guests,2 bedrooms,1 bed,1 bathroom","Mountain view,Valley view,Lake access,Kitchen,Wifi,Free parking on premises,Pets allowed,TV,Unavailable: Carbon monoxide alarmCarbon monoxide alarm,Unavailable: Smoke alarmSmoke alarm","ó¹,Airbnb's COVID-19 safety practices apply,ó± ,No carbon monoxide alarm,ó± ,No smoke alarm,ó± ,Nearby lake, river, other body of water,Show more,Show more,","Check-in: Flexible,Check out: 12:00 pm,Pets are allowed,ó±¤,Smoking is allowed",https://a0.muscache.com/im/pictures/a5da5cb7-cac8-488f-85b1-a0dd78b28a86.jpg?im_w=720 https://a0.muscache.com/im/pictures/9aa4bc71-de8e-444a-b9fa-c2679a71a059.jpg?im_w=720 https://a0.muscache.com/im/pictures/81c2d325-fc04-40da-852f-9490df74b62b.jpg?im_w=720 https://a0.muscache.com/im/pictures/cf7f3f57-8a00-4397-acea-c891760d4b2f.jpg?im_w=720 https://a0.muscache.com/im/pictures/0fc82edc-c539-453b-8a46-268c988530fc.jpg?im_w=720 https://a0.muscache.com/im/pictures/54e427bb-9cb7-4a81-94cf-78f19156faad.jpg,8078,Turkey,1,1,2,0,2,0,Flexible,12 00 pm
50891766,Authentic Beach Architect Sheltered Villa with Pool and Jacuzzi,New,0,Fatih,386223873.0,"KaÅ, Antalya, Turkey","4 guests,2 bedrooms,2 beds,2 bathrooms","Kitchen,Wifi,Dedicated workspace,Free parking on premises,Private pool,Private hot tub,TV,Washing machine,Unavailable: Carbon monoxide alarmCarbon monoxide alarm,Unavailable: Smoke alarmSmoke alarm","ó¹,Airbnb's COVID-19 safety practices apply,ó± ,Carbon monoxide alarm not reported Show more,ó± ,Smoke alarm not reported Show more,Show more,Show more,","Check-in: 4:00 pm - 11:00 pm,Check out: 10:00 am,No smoking,No pets,No parties or events",https://a0.muscache.com/im/pictures/61b70855-2032-4d6d-8064-0209d37ba883.jpg?im_w=720 https://a0.muscache.com/im/pictures/37f85230-5464-4466-9445-c3301cd2dd1b.jpg?im_w=720 https://a0.muscache.com/im/pictures/21bdf86f-d954-4449-b24a-1beeedbcbf56.jpg?im_w=720 https://a0.muscache.com/im/pictures/dfb4fa9e-cb66-41b5-8698-344ac205b939.jpg?im_w=720 https://a0.muscache.com/im/pictures/b5edbce8-0148-42a0-9ce0-fede376fd2de.jpg?im_w=720 https://a0.muscache.com/im/pictures/54e427bb-9cb7-4a81-94cf-78f19156faad.jpg,4665,Turkey,2,2,4,0,2,0,4 00 pm - 11 00 pm,10 00 am
50699164,cottages sataplia,4.85,68,Giorgi,409690853.0,"Imereti, Georgia","4 guests,1 bedroom,3 beds,1 bathroom","Mountain view,Kitchen,Wifi,Dedicated workspace,Free driveway parking on premises,Pets allowed,40"" HDTV with cable/satellite TV,Lift,Unavailable: Carbon monoxide alarmCarbon monoxide alarm,Unavailable: Smoke alarmSmoke alarm","ó¹,Airbnb's COVID-19 safety practices apply,ó± ,No carbon monoxide alarm,ó± ,No smoke alarm,Show more,Show more,","Check-in: After 1:00 pm,Check out: 12:00 pm,Self check-in with lockbox,Pets are allowed",https://a0.muscache.com/im/pictures/miso/Hosting-50699164/original/195db67c-a858-497b-bd8f-1c23df17ee62.jpeg?im_w=720 https://a0.muscache.com/im/pictures/cfa43081-5f9b-42c6-ab30-3e238c920544.jpg?im_w=720 https://a0.muscache.com/im/pictures/a0a56c4e-9859-4b60-b585-0eafdc409bed.jpg?im_w=720 https://a0.muscache.com/im/pictures/ef86d12b-218c-4782-a8f6-fb542c5ef3e0.jpg?im_w=720 https://a0.muscache.com/im/pictures/25a7173b-486f-4013-ab95-928e9f90b40c.jpg?im_w=720 https://a0.muscache.com/im/pictures/54e427bb-9cb7-4a81-94cf-78f19156faad.jpg,5991,Georgia,1,3,4,0,1,0,After 1 00 pm,12 00 pm
49871422,Sapanca Breathable Bungalow,5.0,13,Melih,401873242.0,"Sapanca, Sakarya, Turkey","4 guests,1 bedroom,2 beds,1 bathroom","Mountain view,Valley view,Kitchen,Wifi,Free parking on premises,Private pool,42"" HDTV with Netflix,Security cameras on property,Unavailable: Carbon monoxide alarmCarbon monoxide alarm,Unavailable: Smoke alarmSmoke alarm","ó¹,Airbnb's COVID-19 safety practices apply,ó± ,No carbon monoxide alarm,ó± ,No smoke alarm,ó± ,Security camera/recording device Show more,ó± ,Nearby lake, river, other body of water,Show more,Show more,","Check-in: After 2:00 pm,Check out: 12:00 pm,No smoking,No pets",https://a0.muscache.com/im/pictures/72e6396e-e0c2-436c-b6b7-b985415cc31f.jpg?im_w=720 https://a0.muscache.com/im/pictures/db610024-07b4-4064-93e2-3c781bc6ca38.jpg?im_w=720 https://a0.muscache.com/im/pictures/bb6c23de-75ad-41ce-88dc-92656e67dcfa.jpg?im_w=720 https://a0.muscache.com/im/pictures/c6c8db17-4e1e-4aa0-b303-c27400dca7df.jpg?im_w=720 https://a0.muscache.com/im/pictures/e5652546-1683-409b-b43c-ebbb90877663.jpg?im_w=720 https://a0.muscache.com/im/pictures/54e427bb-9cb7-4a81-94cf-78f19156faad.jpg,11339,Turkey,1,2,4,0,1,0,After 2 00 pm,12 00 pm
51245886,Bungalov Ev 2,New,0,Arp Sapanca,414884116.0,"Sapanca, Sakarya, Turkey","2 guests,1 bedroom,1 bed,1 bathroom","Kitchen,Wifi,Free parking on premises,TV,Air conditioning,Long-term stays allowed,Unavailable: Carbon monoxide alarmCarbon monoxide alarm,Unavailable: Smoke alarmSmoke alarm","ó¹,Airbnb's COVID-19 safety practices apply,ó± ,Carbon monoxide alarm not reported Show more,ó± ,Smoke alarm not reported Show more,ó± ,Nearby lake, river, other body of water,Show more,Show more,","Check-in: After 2:00 pm,Check out: 12:00 pm,No smoking,No pets,No parties or events,Show more,Show more,",https://a0.muscache.com/im/pictures/73973308-e01c-4023-982e-2f8f7a0af015.jpg?im_w=720 https://a0.muscache.com/im/pictures/c6481689-ad69-4f9b-9a0b-7ccdab6a8b05.jpg?im_w=720 https://a0.muscache.com/im/pictures/0beaa85e-a12b-43c8-a27f-4502bee54146.jpg?im_w=720 https://a0.muscache.com/im/pictures/fa25e8fa-fe03-4381-8876-a29c742be46a.jpg?im_w=720 https://a0.muscache.com/im/pictures/41a2b1b5-fd15-449a-a874-1297208b2686.jpg?im_w=720 https://a0.muscache.com/im/pictures/54e427bb-9cb7-4a81-94cf-78f19156faad.jpg,6673,Turkey,1,1,2,0,1,0,After 2 00 pm,12 00 pm
48650769,CasaMia White Suite Treehouse,New,0,Casamia,261290482.0,"Sapanca, Sakarya, Turkey","2 guests,1 bedroom,2 beds,1 bathroom","Lake view,Mountain view,Waterfront,Wifi,Dedicated workspace,Free parking on premises,Private pool,HDTV with Netflix,Security cameras on property,Unavailable: Carbon monoxide alarmCarbon monoxide alarm","ó¹,Airbnb's COVID-19 safety practices apply,ó± ,No carbon monoxide alarm,ó± ,Security camera/recording device Show more,ó± ,Nearby lake, river, other body of water,ó± ,Climbing or play structure,Show more,Show more,","Check-in: After 2:00 pm,Check out: 12:00 pm,No smoking,No pets",https://a0.muscache.com/im/pictures/miso/Hosting-48650769/original/9ac7b33c-82af-4040-b71c-fbedd99d9d90.jpeg?im_w=720 https://a0.muscache.com/im/pictures/miso/Hosting-48650769/original/c15ca735-fea5-4987-bfef-a98167a38328.jpeg?im_w=720 https://a0.muscache.com/im/pictures/miso/Hosting-48650769/original/80f854eb-da6a-4be3-b302-0492dafd5d1f.jpeg?im_w=720 https://a0.muscache.com/im/pictures/miso/Hosting-48650769/original/45316c73-997a-4d21-a2e2-236ce275cad3.jpeg?im_w=720 https://a0.muscache.com/im/pictures/miso/Hosting-48650769/original/bde6c0a9-6aed-4aa5-bfde-a0fba7f6c7d8.jpeg?im_w=720 https://a0.muscache.com/im/pictures/54e427bb-9cb7-4a81-94cf-78f19156faad.jpg,14729,Turkey,1,2,2,0,1,0,After 2 00 pm,12 00 pm
50765985,Ladin Bungalow,New,0,Stephen,15084529.0,"KaÅ, Antalya, Turkey","2 guests,1 bedroom,1 bed,1 bathroom","Garden view,Mountain view,Kitchen,Wifi,Dedicated workspace,Free parking on premises,Pool,Pets allowed,Unavailable: Carbon monoxide alarmCarbon monoxide alarm,Unavailable: Smoke alarmSmoke alarm","ó¹,Airbnb's COVID-19 safety practices apply,ó± ,Carbon monoxide alarm not reported Show more,ó± ,Smoke alarm not reported Show more,Show more,Show more,","Check-in: After 3:00 pm,Check out: 11:00 am,No smoking,Pets are allowed",https://a0.muscache.com/im/pictures/prohost-api/Hosting-50765985/original/ff995cb6-e5dd-43c9-8448-fac2efef3271.jpeg?im_w=720 https://a0.muscache.com/im/pictures/prohost-api/Hosting-50765985/original/3be0bc25-4b3b-45df-b52b-2f9cdcf87b1d.jpeg?im_w=720 https://a0.muscache.com/im/pictures/prohost-api/Hosting-50765985/original/410bd1fd-b341-4641-9793-9350873a9c2b.jpeg?im_w=720 https://a0.muscache.com/im/pictures/prohost-api/Hosting-50765985/original/dffeb26c-9e1a-4c7f-b8e9-b12854032b9a.jpeg?im_w=720 https://a0.muscache.com/im/pictures/prohost-api/Hosting-50765985/original/08bf9b96-95c1-4f89-a10a-64de31f82ff1.jpeg?im_w=720 https://a0.muscache.com/im/pictures/54e427bb-9cb7-4a81-94cf-78f19156faad.jpg,12312,Turkey,1,1,2,0,1,0,After 3 00 pm,11 00 am
40947216,Lavender House,New,0,Caner,318794897.0,"AkÃ§alÄ±, Giresun, Turkey","8 guests,1 bedroom,2 beds,1 bathroom","Wifi,Dedicated workspace,Free parking on premises,Pets allowed,Indoor fireplace,Hair dryer,Fire pit,Long-term stays allowed,Unavailable: Carbon monoxide alarmCarbon monoxide alarm,Unavailable: Smoke alarmSmoke alarm","ó¹,Airbnb's COVID-19 safety practices apply,ó± ,Carbon monoxide alarm not reported Show more,ó± ,Smoke alarm not reported Show more,Show more,Show more,","Check-in: Flexible,Self check-in with building staff,No smoking,Pets are allowed,Show more,Show more,",https://a0.muscache.com/im/pictures/97046585-1b10-442d-b99a-d2a4be9ce892.jpg?im_w=720 https://a0.muscache.com/im/pictures/2c45e585-8707-49aa-8b3c-ec0757942700.jpg?im_w=720 https://a0.muscache.com/im/pictures/4b568352-598d-49a7-9753-f147f11413a2.jpg?im_w=720 https://a0.muscache.com/im/pictures/2c8909d0-3af7-42bd-8c3c-336043c341b5.jpg?im_w=720 https://a0.muscache.com/im/pictures/5bf3f0fd-a6de-4845-b837-9c3c1b64c9e9.jpg?im_w=720 https://a0.muscache.com/im/pictures/54e427bb-9cb7-4a81-94cf-78f19156faad.jpg https://a0.muscache.com/im/pictures/fd2af060-96be-4144-a077-fe57bb1d2912.jpg?im_w=720,13655,Turkey,1,2,8,0,1,0,Flexible,
34043569,Prince's,New,0,Tu,221057563.0,"ThÃ nh phá» ÄÃ Láº¡t, LÃ¢m Äá»ng, Vietnam","2 guests,1 bedroom,1 bed,1 bathroom","Kitchen,Wifi,Dedicated workspace,Free parking on premises,Pets allowed,TV,Washing machine,Dryer,Unavailable: Carbon monoxide alarmCarbon monoxide alarm,Unavailable: Smoke alarmSmoke alarm","ó¹,Airbnb's COVID-19 safety practices apply,ó± ,No smoke alarm,ó± ,Carbon monoxide alarm not reported Show more,Show more,Show more,","Check-in: After 1:00 pm,Check out: 11:00 am,Self check-in with lockbox,Pets are allowed,ó±¤,Smoking is allowed",https://a0.muscache.com/im/pictures/1bd2f3f8-0776-46fc-869c-d54dfb26853d.jpg?im_w=720 https://a0.muscache.com/im/pictures/6d969a6c-4afb-44a9-aa3f-838094b22977.jpg?im_w=720 https://a0.muscache.com/im/pictures/53e46e9f-a0d0-46d0-895f-739587dc273e.jpg?im_w=720 https://a0.muscache.com/im/pictures/3290241c-66bc-48ff-b7aa-0ed9fbd5bfab.jpg?im_w=720 https://a0.muscache.com/im/pictures/d3292e15-c2d7-450c-85ae-bdfdfedf884e.jpg?im_w=720 https://a0.muscache.com/im/pictures/54e427bb-9cb7-4a81-94cf-78f19156faad.jpg,1747,Vietnam,1,1,2,0,1,0,After 1 00 pm,11 00 am
42075682,"The Cottage, Private Pool Villa",4.67,3,Sukanya,173126583.0,"Tambon Bang Kachai, Chang Wat Chanthaburi, Thailand","10 guests,4 bedrooms,6 beds,3 bathrooms","Kitchen,Wifi,Free parking on premises,Private pool,TV,Air conditioning,Breakfast,Long-term stays allowed,Unavailable: Carbon monoxide alarmCarbon monoxide alarm","ó¹,Airbnb's COVID-19 safety practices apply,ó± ,Carbon monoxide alarm not reported Show more,Smoke alarm,Show more,Show more,","Check-in: After 3:00 pm,No smoking,No pets",https://a0.muscache.com/im/pictures/42d26843-493a-4690-9817-5e4316f8f1ea.jpg?im_w=720 https://a0.muscache.com/im/pictures/2cf0fb84-e01e-4439-8635-e8c0cca9396e.jpg?im_w=720 https://a0.muscache.com/im/pictures/5fb7483d-dd26-4343-8e86-00d4d916840a.jpg?im_w=720 https://a0.muscache.com/im/pictures/2b1e192e-9158-4750-9c6a-24c94bd5faab.jpg?im_w=720 https://a0.muscache.com/im/pictures/c7e3ed6d-47b4-48fb-b9be-dbd95628b01a.jpg?im_w=720 https://a0.muscache.com/im/pictures/54e427bb-9cb7-4a81-94cf-78f19156faad.jpg,30486,Thailand,3,6,10,0,4,0,After 3 00 pm,
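
In the rows above, `checkout` holds values such as `12 00 pm`, that is, the time from `hourse_rules` with the colon dropped, while `checkin` also contains free text such as `Flexible` or `After 2 00 pm`. A minimal sketch of turning the plain clock values into `datetime.time` objects follows. `parse_clock` is a hypothetical helper (not part of the original notebook), and it assumes the default C/English locale so that `%p` matches `am`/`pm`:

```python
from datetime import datetime, time

# Hypothetical helper, not part of the original notebook.
# Assumption: plain clock values look like "<hour> <minute> am|pm".
def parse_clock(value):
    """Parse strings such as "12 00 pm" into datetime.time; return None otherwise."""
    try:
        return datetime.strptime(value.strip(), "%I %M %p").time()
    except (AttributeError, ValueError):
        # AttributeError: missing value that is not a string
        # ValueError: free text such as "Flexible" or "After 2 00 pm"
        return None


print(parse_clock("1 00 pm"))
print(parse_clock("Flexible"))
```

Values that return `None` would still need a separate rule, for example one that strips a leading `After` before parsing.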


### CONVERTING THE PANDAS DATAFRAME TO SPARK

In [0]:

spark_df = spark.createDataFrame(df)

### CREATING THE TABLE IN SPARK


In [0]:
spark_df.write.mode("overwrite").saveAsTable('tbl_airbnb')

### SQL QUERIES ON THE AIRBNB TABLE

### QUERY 1 - description of the columns and data types

In [0]:
%sql
describe table tbl_airbnb;

col_name,data_type,comment
id,bigint,
name,string,
rating,string,
reviews,string,
host_name,string,
host_id,double,
address,string,
features,string,
amenities,string,
safety_rules,string,
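
The schema above lists `rating` as `string`, which is consistent with the sample rows, where some listings carry the value `New` instead of a number. A minimal pure-Python sketch of a conversion that yields a null on failure, similar in spirit to the `try_cast` call in the next query, is shown below. `parse_rating` is a hypothetical helper, not part of the original notebook:

```python
import math

# Hypothetical helper, not part of the original notebook.
def parse_rating(value):
    """Return the rating as a float, or None when it is not numeric (e.g. "New")."""
    try:
        rating = float(value)
    except (TypeError, ValueError):
        return None
    # Treat NaN as missing as well
    return None if math.isnan(rating) else rating


print([parse_rating(v) for v in ["4.71", "New", "5.0"]])  # prints [4.71, None, 5.0]
```

Applied to the pandas column before the Spark conversion, a helper like this would let `rating` be stored as a numeric type instead of `string`.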


### QUERY 2 - fetch the first 20 records with a rating above 4.0 and a price above 20,000

In [0]:
%sql
SELECT * 
FROM tbl_airbnb WHERE try_cast(rating AS DOUBLE) > 4.0 and price > 20000
LIMIT 20;

id,name,rating,reviews,host_name,host_id,address,features,amenities,safety_rules,hourse_rules,img_links,price,country,bathrooms,beds,guests,toiles,bedrooms,studios,checkin,checkout
32300068,Dammuso on the sea in Martingana,5.0,4,Mariangela,240347675.0,"Martingana, Sicilia, Italy","2 guests,1 bedroom,1 bed,1 bathroom","Beach access â Beachfront,Kitchen,Wifi,Dedicated workspace,Free parking on premises,TV,Air conditioning,Patio or balcony,Unavailable: Carbon monoxide alarmCarbon monoxide alarm,Unavailable: Smoke alarmSmoke alarm","ó¹,Airbnb's COVID-19 safety practices apply,ó± ,No carbon monoxide alarm,ó± ,No smoke alarm,Show more,Show more,","Check out: 10:00 am,No pets,No parties or events,ó±¤,Smoking is allowed",https://a0.muscache.com/im/pictures/miso/Hosting-32300068/original/95186eab-0060-408e-8bd2-edf199ba5591.jpeg?im_w=720 https://a0.muscache.com/im/pictures/12189b4e-d64a-4111-ba22-e08f8ca3c3bb.jpg?im_w=720 https://a0.muscache.com/im/pictures/802d3e8f-cf0d-48a3-aefe-36dec34c61c3.jpg?im_w=720 https://a0.muscache.com/im/pictures/44da3b6c-5586-484a-9ef8-1b078b956d9b.jpg?im_w=720 https://a0.muscache.com/im/pictures/40f67100-aa8d-40b9-a942-d64ba1f626b6.jpg?im_w=720 https://a0.muscache.com/im/pictures/54e427bb-9cb7-4a81-94cf-78f19156faad.jpg https://a0.muscache.com/im/pictures/a9bc9ac7-fc4d-4225-8bc5-5d12882c8ceb.jpg?im_w=720,22722,Italy,1,1,2,0,1,0,,10 00 am
24457840,Dammuso with independent pool,5.0,3,Francesca,9750394.0,"Pantelleria, Sicilia, Italy","8 guests,4 bedrooms,4 beds,4 bathrooms","Kitchen,Free parking on premises,Private pool,Pets allowed,TV,Washing machine,Private patio or balcony,Garden,Unavailable: Carbon monoxide alarmCarbon monoxide alarm,Unavailable: Smoke alarmSmoke alarm","ó¹,Airbnb's COVID-19 safety practices apply,ó± ,Carbon monoxide alarm not reported Show more,ó± ,Smoke alarm not reported Show more,Show more,Show more,","Check-in: Flexible,Check out: 12:00 am,No parties or events,Pets are allowed,ó±¤,Smoking is allowed",https://a0.muscache.com/im/pictures/a90fab8b-5625-4050-8634-4f37e087097b.jpg?im_w=720 https://a0.muscache.com/im/pictures/233cf377-04a5-4f2f-a995-93b3bacf7011.jpg?im_w=720 https://a0.muscache.com/im/pictures/27da0d6b-410e-45da-973c-0552d1c89b2a.jpg?im_w=720 https://a0.muscache.com/im/pictures/1ef6893b-a616-4799-a9a3-ffdd9995bb69.jpg?im_w=720 https://a0.muscache.com/im/pictures/059e1061-a0fb-4eb1-9841-86d6ff3b9ff7.jpg?im_w=720 https://a0.muscache.com/im/pictures/54e427bb-9cb7-4a81-94cf-78f19156faad.jpg,45411,Italy,4,4,8,0,4,0,Flexible,12 00 am
1277171,Residence Bugeber- Gaia's,5.0,8,Marina,6782187.0,"Pantelleria, Sicily, Italy","6 guests,2 bedrooms,3 beds,2 bathrooms","Kitchen,Free parking on premises,Shared pool,Luggage drop-off allowed,Travel cot,Childrenâs books and toys,Hair dryer,Smoking allowed,Unavailable: Carbon monoxide alarmCarbon monoxide alarm,Unavailable: Smoke alarmSmoke alarm","ó¹,Airbnb's COVID-19 safety practices apply,ó± ,Carbon monoxide alarm not reported Show more,ó± ,Smoke alarm not reported Show more,Show more,Show more,","Check-in: After 12:00 pm,Check out: 10:00 am,ó±¤,Smoking is allowed,Show more,Show more,",https://a0.muscache.com/im/pictures/cc756c91-a82e-4f4c-95c7-b294f3a2cbf1.jpg?im_w=720 https://a0.muscache.com/im/pictures/22341236/11651a74_original.jpg?im_w=720 https://a0.muscache.com/im/pictures/19255169/22b03518_original.jpg?im_w=720 https://a0.muscache.com/im/pictures/19255185/8eca9ec8_original.jpg?im_w=720 https://a0.muscache.com/im/pictures/a0d8049c-26ec-4b14-9f22-757a32d83670.jpg?im_w=720 https://a0.muscache.com/im/pictures/54e427bb-9cb7-4a81-94cf-78f19156faad.jpg,29443,Italy,2,3,6,0,2,0,After 12 00 pm,10 00 am
39113140,Treehouse De Valentine,4.91,106,Mikheyla Fox,30842469.0,"Balamban, Central Visayas, Philippines","8 guests,3 bedrooms,5 beds,2 bathrooms","Kitchen,Free parking on premises,Hot tub,Pets allowed,TV,Patio or balcony,Garden,Luggage drop-off allowed,Security cameras on property,Unavailable: Carbon monoxide alarmCarbon monoxide alarm","ó¹,Airbnb's COVID-19 safety practices apply,ó± ,Carbon monoxide alarm not reported Show more,ó± ,Security camera/recording device Show more,Smoke alarm,Show more,Show more,","Check-in: After 3:00 pm,Check out: 11:00 am,No smoking,Pets are allowed,Show more,Show more,",https://a0.muscache.com/im/pictures/miso/Hosting-39113140/original/b133420b-e520-426a-ba44-0949dc2ab4ba.jpeg?im_w=720 https://a0.muscache.com/im/pictures/miso/Hosting-39113140/original/b7dcb067-d129-4c16-966d-a717b08bfc06.jpeg?im_w=720 https://a0.muscache.com/im/pictures/miso/Hosting-39113140/original/94822928-7889-4dec-94c6-9a171a66e4c2.jpeg?im_w=720 https://a0.muscache.com/im/pictures/miso/Hosting-39113140/original/6f703483-0354-43e4-9648-3fe7f5019b21.jpeg?im_w=720 https://a0.muscache.com/im/pictures/miso/Hosting-39113140/original/66683154-2bc0-4319-b6da-cdc1e9e4cab4.jpeg?im_w=720 https://a0.muscache.com/im/pictures/54e427bb-9cb7-4a81-94cf-78f19156faad.jpg,20537,Philippines,2,5,8,0,3,0,After 3 00 pm,11 00 am
14032398,Ã Auge - River Eye - Treehouse,4.92,379,Isaac,24972842.0,"Tinn, Telemark, Norway","7 guests,3 beds,1 bathroom","Waterfront,Kitchen,Free parking on premises,Pets allowed,Patio or balcony,Garden,Indoor fireplace,Long-term stays allowed","ó¹,Airbnb's COVID-19 safety practices apply,ó± ,Nearby lake, river, other body of water,Carbon monoxide alarm,Smoke alarm,Show more,Show more,","Check-in: After 4:00 pm,Check out: 12:00 pm,No smoking,Pets are allowed",https://a0.muscache.com/im/pictures/f92763a5-4766-4521-8fa7-ff4943ccb74f.jpg?im_w=720 https://a0.muscache.com/im/pictures/5758245f-6b05-4095-b1af-7edb316c80af.jpg?im_w=720 https://a0.muscache.com/im/pictures/2dd4132f-ab48-4d1c-b237-52861207cce6.jpg?im_w=720 https://a0.muscache.com/im/pictures/3162ff6a-af8f-46ea-a869-d7d088e0afd2.jpg?im_w=720 https://a0.muscache.com/im/pictures/06446812-8e26-4e94-9ca7-1f59939668ca.jpg?im_w=720 https://a0.muscache.com/im/pictures/54e427bb-9cb7-4a81-94cf-78f19156faad.jpg,23529,Norway,1,3,7,0,0,0,After 4 00 pm,12 00 pm
17254131,Amazing panoramic tree house Norway,4.85,374,Olav,14442784.0,"Sandane , Sogn og Fjordane, Norway","2 guests,1 bedroom,1 bed,1 bathroom","Mountain view,Ocean view,Beach access,Kitchen,Free parking on premises,Shared patio or balcony,Shared backyard â Not fully fenced,Fire pit,Long-term stays allowed","ó¹,Airbnb's COVID-19 safety practices apply,Carbon monoxide alarm,Smoke alarm,Show more,Show more,","Check-in: 3:00 pm - 10:00 pm,Check out: 12:00 pm,Self check-in with lockbox,No smoking,No pets,No parties or events",https://a0.muscache.com/im/pictures/b03eae1b-ef3a-4d08-ad4e-670bb35f5f54.jpg?im_w=720 https://a0.muscache.com/im/pictures/d945546b-672c-4747-8cb9-599f609920ab.jpg?im_w=720 https://a0.muscache.com/im/pictures/93f6c3a1-9868-4bde-876c-105098c251fb.jpg?im_w=720 https://a0.muscache.com/im/pictures/769a777e-86cd-4f80-83b8-7862f4b24111.jpg?im_w=720 https://a0.muscache.com/im/pictures/67a740f5-a077-4605-93d3-fb0e60766bc7.jpg?im_w=720 https://a0.muscache.com/im/pictures/54e427bb-9cb7-4a81-94cf-78f19156faad.jpg,23434,Norway,1,1,2,0,1,0,3 00 pm - 10 00 pm,12 00 pm
6668057,Big tree house in the Vosges,4.58,85,Ewoud,32257422.0,"Saint-DiÃ©-des-Vosges, Lorraine, France","6 guests,1 bedroom,3 beds,1 bathroom","Valley view,Kitchen,Wifi,Free parking on premises,TV,Patio or balcony,Garden,Indoor fireplace,Unavailable: Carbon monoxide alarmCarbon monoxide alarm,Unavailable: Smoke alarmSmoke alarm","ó¹,Airbnb's COVID-19 safety practices apply,ó± ,Carbon monoxide alarm not reported Show more,ó± ,Smoke alarm not reported Show more,Show more,Show more,","Check-in: 5:00 pm - 7:00 pm,Check out: 10:00 am,No pets",https://a0.muscache.com/im/pictures/84073387/26572c9d_original.jpg?im_w=720 https://a0.muscache.com/im/pictures/84073418/ce55ab88_original.jpg?im_w=720 https://a0.muscache.com/im/pictures/84073436/6af1e6d6_original.jpg?im_w=720 https://a0.muscache.com/im/pictures/84073452/e3b5ad60_original.jpg?im_w=720 https://a0.muscache.com/im/pictures/84073466/aaa164d8_original.jpg?im_w=720 https://a0.muscache.com/im/pictures/54e427bb-9cb7-4a81-94cf-78f19156faad.jpg,20950,France,1,3,6,0,1,0,5 00 pm - 7 00 pm,10 00 am
44567837,MjÃ¸sglÃ¥t -Unique treehouse - FarÃ¥sen Treehouse,5.0,12,Ruben,56386285.0,"Ringsaker, Innlandet, Norway","9 guests,2 bedrooms,6 beds,1 bathroom","Kitchen,Free parking on premises,Washing machine,Private patio or balcony,Private backyard â Not fully fenced,Indoor fireplace,Fire pit,Refrigerator,Long-term stays allowed,Unavailable: Carbon monoxide alarmCarbon monoxide alarm","ó¹,Airbnb's COVID-19 safety practices apply,ó± ,Carbon monoxide alarm not reported Show more,Smoke alarm,Show more,Show more,","Check-in: 4:00 pm - 12:00 am,Check out: 1:00 pm,No smoking,No pets,No parties or events",https://a0.muscache.com/im/pictures/044ab2d6-8163-42b3-b315-eafd722acfaf.jpg?im_w=720 https://a0.muscache.com/im/pictures/7d00426e-63d9-4d35-a3be-d5ac2fe31f53.jpg?im_w=720 https://a0.muscache.com/im/pictures/dbad66a2-cbff-4388-96ef-55c51ae5799c.jpg?im_w=720 https://a0.muscache.com/im/pictures/86286d8f-6356-4c48-b62f-03a92850f12b.jpg?im_w=720 https://a0.muscache.com/im/pictures/6b7b75ab-0bb0-4435-b87d-43197b0216a3.jpg?im_w=720 https://a0.muscache.com/im/pictures/54e427bb-9cb7-4a81-94cf-78f19156faad.jpg https://a0.muscache.com/im/pictures/f2c3f8dd-d9da-4a50-abd6-68c8073c5d8a.jpg?im_w=720 https://a0.muscache.com/im/pictures/dbad66a2-cbff-4388-96ef-55c51ae5799c.jpg?im_w=720,28204,Norway,1,6,9,0,2,0,4 00 pm - 12 00 am,1 00 pm
31815592,Casang Wheels - La Cabane,5.0,23,Christine,53026448.0,"Porto-Vecchio, Corse, France","2 guests,1 bedroom,1 bed,1 bathroom","Kitchen,Wifi,Free parking on premises,Shared pool,Private hot tub,Pets allowed,TV with standard cable/satellite,Washing machine,Dryer,Air conditioning","ó¹,Airbnb's COVID-19 safety practices apply,Carbon monoxide alarm,Smoke alarm,Show more,Show more,","Check-in: 4:00 pm - 12:00 am,Check out: 11:00 am,No smoking,No parties or events,Pets are allowed,Show more,Show more,",https://a0.muscache.com/im/pictures/999af14b-50e0-439d-8972-90570a2cf77e.jpg?im_w=720 https://a0.muscache.com/im/pictures/82e15a8c-5fea-4da3-bad7-1a029063dd4c.jpg?im_w=720 https://a0.muscache.com/im/pictures/157a0b81-e7d4-4e3e-a840-2b692377af4d.jpg?im_w=720 https://a0.muscache.com/im/pictures/ae57fb08-77de-4031-98b2-54d039ef866d.jpg?im_w=720 https://a0.muscache.com/im/pictures/6d0be772-4778-49bd-aa48-c8ac492ace65.jpg?im_w=720 https://a0.muscache.com/im/pictures/54e427bb-9cb7-4a81-94cf-78f19156faad.jpg https://a0.muscache.com/im/pictures/999af14b-50e0-439d-8972-90570a2cf77e.jpg?im_w=720,27390,France,1,1,2,0,1,0,4 00 pm - 12 00 am,11 00 am
2065430,Real Treehouse like in your dreams,4.69,16,Alkim,2517230.0,"TekirdaÄ, Tekirdag, Turkey","4 guests,1 bedroom,2 beds,1.5 bathrooms","Kitchen,Wifi,Free parking on premises,Pool,Pets allowed,TV with standard cable/satellite,Washing machine,Dryer,Unavailable: Carbon monoxide alarmCarbon monoxide alarm,Unavailable: Smoke alarmSmoke alarm","ó¹,Airbnb's COVID-19 safety practices apply,ó± ,Carbon monoxide alarm not reported Show more,ó± ,Smoke alarm not reported Show more,Show more,Show more,","Check-in: After 2:00 pm,Check out: 12:00 pm,No smoking,No parties or events,Pets are allowed",https://a0.muscache.com/im/pictures/28235745/6706c8a7_original.jpg?im_w=720 https://a0.muscache.com/im/pictures/28234431/4e158a15_original.jpg?im_w=720 https://a0.muscache.com/im/pictures/28234545/8da72595_original.jpg?im_w=720 https://a0.muscache.com/im/pictures/28234667/156f515b_original.jpg?im_w=720 https://a0.muscache.com/im/pictures/28234828/81e16333_original.jpg?im_w=720 https://a0.muscache.com/im/pictures/54e427bb-9cb7-4a81-94cf-78f19156faad.jpg,22837,Turkey,1,2,4,0,1,0,After 2 00 pm,12 00 pm


### QUERY 3 - count the total number of records in the tbl_airbnb table

In [0]:
%sql

SELECT COUNT(*) FROM tbl_airbnb;

COUNT(*)
12805


### QUERY 4 - return the first 20 records of the fields name, rating, host name, address, price, and country from the tbl_airbnb table where the number of bedrooms is at least 2

In [0]:
%sql
SELECT name, rating, host_name, address, price, country
FROM tbl_airbnb
WHERE bedrooms >= 2
LIMIT 20;

name,rating,host_name,address,price,country
Dammuso immerso nel Parco dei Sesi,New,Roberta,"Pantelleria, Sicilia, Italy",18248,Italy
"Dammuso LA PALMA, Pantelleria",4.83,Alberto,"Pantelleria, Sicily, Italy",12147,Italy
Dammuso Le Lantane,5.0,Andrea,"Khamma, Sicilia, Italy",14592,Italy
Dammuso di Giorgia,5.0,Adele,"Pantelleria, Sicilia, Italy",12684,Italy
Dammuso with independent pool,5.0,Francesca,"Pantelleria, Sicilia, Italy",45411,Italy
Dammuso Pantelleria Mandorlo,4.7,Giacomo,"Pantelleria, Sicily, Italy",3171,Italy
Dammuso della Luna: The magic of starry nights,New,Enrica,"Pantelleria, Sicilia, Italy",66590,Italy
Dammuso Scirafi sunset 1,4.75,Debora,"Pantelleria, Sicilia, Italy",14224,Italy
Dammuso il Cucciolo - Ocean view villa,New,Federica,"Pantelleria, Italy",28992,Italy
DAMMUSO EOLO PANTELLERIA,New,Guenda Giulia,"Pantelleria, Sicilia, Italy",71029,Italy
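
The output above shows `rating` holding the literal `New` next to numeric values, and the query has no `ORDER BY`, so which 20 rows come back is not guaranteed. The following is a minimal sketch of one way to address both points. It uses Python's built-in `sqlite3` with an in-memory table as a stand-in for `tbl_airbnb`, not the Databricks table itself. The names, ratings, and prices are copied from three of the rows above; the bedroom counts and the `rating_num` alias are assumptions made for illustration.

```python
import sqlite3

# In-memory stand-in for tbl_airbnb (only the columns needed here).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tbl_airbnb (name TEXT, rating TEXT, price INTEGER, bedrooms INTEGER)"
)

# Names, ratings, and prices come from the output above;
# the bedroom counts are assumed values that satisfy bedrooms >= 2.
rows = [
    ("Dammuso immerso nel Parco dei Sesi", "New", 18248, 2),
    ("Dammuso Le Lantane", "5.0", 14592, 2),
    ("Dammuso Pantelleria Mandorlo", "4.7", 3171, 3),
]
conn.executemany("INSERT INTO tbl_airbnb VALUES (?, ?, ?, ?)", rows)

# Map the 'New' placeholder to NULL so the rating can be treated as a number,
# and add ORDER BY so the rows returned by LIMIT are deterministic.
query = """
SELECT name,
       CASE WHEN rating = 'New' THEN NULL ELSE CAST(rating AS REAL) END AS rating_num,
       price
FROM tbl_airbnb
WHERE bedrooms >= 2
ORDER BY price DESC, name
LIMIT 20
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

On Databricks itself, `try_cast(rating AS DOUBLE)` would likely play the same role as the `CASE` expression, since it returns NULL when the cast fails.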


### QUERY 5 - return the 20 countries with the highest average price

In [0]:
%sql
SELECT country, ROUND(AVG(price), 2) AS avg_price
FROM tbl_airbnb
GROUP BY country
ORDER BY avg_price DESC limit 20;

country,avg_price
Seychelles,155225.11
Honduras,140998.5
United Arab Emirates,119633.65
Bahamas,116397.5
Qatar,112717.5
Belize,111575.15
Belize,110603.0
Colombia,93451.15
Jamaica,82766.0
Maldives,82202.25
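
Belize appears twice in the result above with different averages, which suggests that the `country` column holds two distinct strings for the same country. The output does not show the cause. Assuming it is stray whitespace (an assumption, not something confirmed by the data), trimming the key before grouping would merge the two groups. The sketch below illustrates the idea with Python's built-in `sqlite3` and invented prices; the `country_clean` alias is hypothetical.

```python
import sqlite3

# In-memory stand-in for tbl_airbnb; the prices are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl_airbnb (country TEXT, price REAL)")
conn.executemany(
    "INSERT INTO tbl_airbnb VALUES (?, ?)",
    [("Belize", 100.0), (" Belize", 200.0), ("Jamaica", 50.0)],
)

# Grouping on the raw column treats 'Belize' and ' Belize' as different countries.
raw = conn.execute(
    """
    SELECT country, ROUND(AVG(price), 2) AS avg_price
    FROM tbl_airbnb
    GROUP BY country
    ORDER BY avg_price DESC
    """
).fetchall()

# Grouping on the trimmed value merges the two variants into one group.
clean = conn.execute(
    """
    SELECT TRIM(country) AS country_clean, ROUND(AVG(price), 2) AS avg_price
    FROM tbl_airbnb
    GROUP BY TRIM(country)
    ORDER BY avg_price DESC
    """
).fetchall()

print(raw)
print(clean)
```

Before applying a fix like this to the real table, it is worth inspecting the distinct values of `country` first, since the duplicates could also come from casing or spelling variants that `TRIM` does not address.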
