From: https://github.com/apache/parquet-format/blob/master/LogicalTypes.md

TIMESTAMP
In data annotated with the TIMESTAMP logical type, each value is a single int64 number that can be decoded into year, month, day, hour, minute, second and subsecond fields using calculations detailed below. Please note that a value defined this way does not necessarily correspond to a single instant on the time-line and such interpretations are allowed on purpose.

The TIMESTAMP type has two type parameters:

isAdjustedToUTC must be either true or false.
unit must be one of MILLIS, MICROS or NANOS. This list is subject to potential expansion in the future. Upon reading, unknown units must be handled as unsupported features (rather than as errors in the data files).

Instant semantics (timestamps normalized to UTC)
A TIMESTAMP with isAdjustedToUTC=true is defined as the number of milliseconds, microseconds or nanoseconds (depending on the unit parameter being MILLIS, MICROS or NANOS, respectively) elapsed since the Unix epoch, 1970-01-01 00:00:00 UTC. Each such value unambiguously identifies a single instant on the time-line.
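
The definition above can be sketched with the Python standard library alone. This is a minimal illustration, not how any Parquet reader is implemented: the names `EPOCH`, `UNITS_PER_SECOND` and `decode_instant` are hypothetical, and because `datetime` only resolves microseconds, NANOS values are truncated to microsecond precision here.

```python
from datetime import datetime, timedelta, timezone

# The Unix epoch, 1970-01-01 00:00:00 UTC.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

# How many units make up one second, per value of the `unit` parameter.
UNITS_PER_SECOND = {
    "MILLIS": 1_000,
    "MICROS": 1_000_000,
    "NANOS": 1_000_000_000,
}


def decode_instant(value: int, unit: str) -> datetime:
    """Decode an int64 TIMESTAMP (isAdjustedToUTC=true) into a UTC datetime."""
    if unit not in UNITS_PER_SECOND:
        # Unknown units are an unsupported feature, not a data error.
        raise NotImplementedError(f"unsupported TIMESTAMP unit: {unit}")
    per_second = UNITS_PER_SECOND[unit]
    # divmod floors, so values before the epoch (negative) decode correctly.
    seconds, remainder = divmod(value, per_second)
    # Convert the subsecond remainder to whole microseconds (truncating NANOS).
    micros = remainder * 1_000_000 // per_second
    return EPOCH + timedelta(seconds=seconds, microseconds=micros)


print(decode_instant(631_152_000_000, "MILLIS"))  # 1990-01-01 00:00:00+00:00
```

Using integer arithmetic throughout avoids the rounding errors that floating-point division would introduce for large NANOS values.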

import pandas as pd
import pyarrow as pa
import pyarrow.parquet as pq

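# Create a sample dataset. The datetimes below carry no time zone, so pyarrow
# stores date_of_birth as a TIMESTAMP column with isAdjustedToUTC=false.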
data = {
    'id': [1, 2, 3, 4, 5],
    'name': ['Alice', 'Bob', 'Charlie', 'David', 'Eve'],
    'date_of_birth': pd.to_datetime(['1990-01-01', '1985-05-15', '1992-07-30', '1988-03-22', '1995-12-10'])
}

df = pd.DataFrame(data)

# Convert the Pandas DataFrame to an Arrow Table
table = pa.Table.from_pandas(df)

# Write the table to a Parquet file
pq.write_table(table, 'sample_dataset.parquet')

# Read the Parquet file back into a Pandas DataFrame
df_read = pd.read_parquet('sample_dataset.parquet')

# Select rows whose date_of_birth is strictly after 1990-01-01
# (pandas parses the string as a timestamp for the comparison)
result = df_read[df_read['date_of_birth'] > '1990-01-01']

print("Parquet file written and queried successfully.")
print(result)
