Closed
Labels
bug (Something isn't working)
Description
Describe the unexpected behaviour
When querying a Parquet file with chdb, strings become bytes in the DataFrame and Arrow output formats. I don't see this issue with JSON or CSV.
How to reproduce
The query
SELECT AVG(prix) as prix_moy,
pdvid,
name,
ville,
type_carburant
FROM s3('https://********.s3.eu-west-1.amazonaws.com/instantane.parquet', 'Parquet') AS p
LEFT JOIN s3('https://************.s3.eu-west-1.amazonaws.com/station.csv', '*****', '****', 'CSVWithNames') AS stations
ON p.pdvid = stations.id
GROUP BY all
ORDER BY prix_moy DESC;
The results with clickhouse local

The result with chdb
In [25]: res
Out[25]:
prix_moy pdvid name ville type_carburant
0 2.799 49480005 b"BP A11 AIRE DES PORTES D'ANGERS SUD" b"Saint-Sylvain-D'Anjou" b'SP98'
1 2.770 75014008 None b'Paris' b'SP98'
2 2.740 75014008 None b'Paris' b'SP95'
3 2.699 49160003 b'SARL ROUX' b'Longu\xc3\xa9-Jumelles' b'SP98'
4 2.690 75016011 b'Sarl STATION KLEBER' b'Paris' b'SP98'
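Until the output types are fixed, one possible workaround is to decode the bytes values after the query returns. The sketch below is only an illustration: `decode_value` and `decode_rows` are hypothetical helper names, the rows are copied by hand from the output above, and it assumes the bytes are valid UTF-8 (which the `\xc3\xa9` sequence in 'Longu\xc3\xa9-Jumelles' suggests).

```python
from typing import Any, Dict, List


def decode_value(value: Any, encoding: str = "utf-8") -> Any:
    """Decode bytes-like values to str; leave everything else (None, numbers) as is."""
    if isinstance(value, (bytes, bytearray)):
        return value.decode(encoding)
    return value


def decode_rows(rows: List[Dict[str, Any]], encoding: str = "utf-8") -> List[Dict[str, Any]]:
    """Return new row dicts with every bytes value decoded."""
    return [{key: decode_value(val, encoding) for key, val in row.items()} for row in rows]


# Rows copied from the chdb output shown above (hypothetical stand-in for the real result).
rows = [
    {"prix_moy": 2.799, "pdvid": 49480005,
     "name": b"BP A11 AIRE DES PORTES D'ANGERS SUD",
     "ville": b"Saint-Sylvain-D'Anjou", "type_carburant": b"SP98"},
    {"prix_moy": 2.770, "pdvid": 75014008,
     "name": None, "ville": b"Paris", "type_carburant": b"SP98"},
    {"prix_moy": 2.699, "pdvid": 49160003,
     "name": b"SARL ROUX",
     "ville": b"Longu\xc3\xa9-Jumelles", "type_carburant": b"SP98"},
]

decoded = decode_rows(rows)
for row in decoded:
    print(row)
```

With a pandas DataFrame, the same `decode_value` helper could be applied to each object column, for example `res[col] = res[col].map(decode_value)`. This only hides the symptom on the Python side; it does not change what chdb returns.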