# Converting the Latest DataFrame to Parquet

We need to store the dataset in a warehouse, so we convert it to Parquet, a columnar storage format.

In [11]:
import pandas as pd
import pyarrow as pa
import pyarrow.parquet as pq

# Load the tab-separated source file into a pandas DataFrame
df = pd.read_csv('dialogues.csv', sep='\t', encoding='utf-8')

In [12]:
# Convert Pandas DataFrame to Arrow Table
table = pa.Table.from_pandas(df)

# Specify the output file path for the Parquet file (the directory must already exist)
parquet_file_path = './data/parquet/dialogues.parquet'

# Write the Arrow Table to a Parquet file
pq.write_table(table, parquet_file_path)

print(f'DataFrame saved to Parquet file: {parquet_file_path}')

DataFrame saved to Parquet file: ./data/parquet/dialogues.parquet
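
The write step assumes that `./data/parquet` already exists; `pq.write_table` is not expected to create missing parent directories. A minimal sketch of one way to guard against that, using only the standard library. The helper name `ensure_parent_dir` is hypothetical and not part of the original notebook:

```python
from pathlib import Path


def ensure_parent_dir(file_path: str) -> Path:
    """Create the parent directory of file_path if it is missing (hypothetical helper)."""
    path = Path(file_path)
    # parents=True creates intermediate directories; exist_ok=True makes reruns safe
    path.parent.mkdir(parents=True, exist_ok=True)
    return path


# Same path as in the cell above; call this before pq.write_table(table, parquet_file_path)
parquet_file_path = './data/parquet/dialogues.parquet'
ensure_parent_dir(parquet_file_path)
```

Running this before the write cell makes the notebook safe to execute from a fresh checkout where the output folder has not been created yet.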


In [13]:
# Read Parquet file into Arrow Table
table = pq.read_table(parquet_file_path)

# Convert Arrow Table to Pandas DataFrame
df = table.to_pandas()


In [15]:
df.head()

Unnamed: 0,Description,Patient,Doctor
0,Q. What does abutment of the nerve root mean?,"Hi doctor,I am just wondering what is abutting...",Hi. I have gone through your query with dilige...
1,Q. What should I do to reduce my weight gained...,"Hi doctor, I am a 22-year-old female who was d...",Hi. You have really done well with the hypothy...
2,Q. I have started to get lots of acne on my fa...,Hi doctor! I used to have clear skin but since...,Hi there Acne has multifactorial etiology. Onl...
3,Q. Why do I have uncomfortable feeling between...,"Hello doctor,I am having an uncomfortable feel...",Hello. The popping and discomfort what you fel...
4,Q. My symptoms after intercourse threatns me e...,"Hello doctor,Before two years had sex with a c...",Hello. The HIV test uses a finger prick blood ...
