[BUG] #185
Comments
I can't reproduce this.

In [1]: # Import Python modules
...: import pandas as pd
...: import numpy as np
...: from datar.data import starwars
...: from datar.all import *
In [2]: starwars >> count(f.species)
Out[2]:
          species       n
         <object> <int64>
0           Human      35
1           Droid       6
2         Wookiee       2
3          Rodian       1
4            Hutt       1
5  Yoda's species       1
6      Trandoshan       1
...

Can you provide the output of
Hey pwwang, Sure! Thanks a lot for looking into this issue! The output of datar.get_versions() is as follows: Thanks again & Cheers, Gernot
The problem is probably the pandas version. pandas 2.x can use pyarrow as a backend, whereas pandas 1.x uses numpy, and the two versions differ in many ways. We need a separate backend to support pandas 2.
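To check quickly whether an environment falls into the case described above, the installed pandas major version can be inspected before calling datar. This is only a minimal sketch and not part of datar: the helper names `major_version` and `pandas_major` are hypothetical, and it assumes, as suggested in the comment above, that pandas 2.x is what triggers the missing n column.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def major_version(version_string: str) -> int:
    """Return the leading integer of a version string such as '2.0.3'."""
    return int(version_string.split(".")[0])


def pandas_major() -> Optional[int]:
    """Major version of the installed pandas, or None if pandas is absent."""
    try:
        return major_version(version("pandas"))
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    major = pandas_major()
    if major is None:
        print("pandas is not installed")
    elif major >= 2:
        # Assumption taken from this thread: pandas 2.x is the likely cause.
        print(f"pandas {major}.x detected; count() may drop the n column")
    else:
        print(f"pandas {major}.x detected")
```

With the pandas version reported in this issue (2.0.3), `major_version("2.0.3")` returns 2, so the script would print the warning branch. Whether installing a pandas 1.x release actually restores the n column is not confirmed anywhere in this thread.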
datar version checks
I have checked that this issue has not already been reported.
I have confirmed this bug exists on the latest version of datar and its backends.
Issue Description
Hey pwwang,
My apologies that I couldn't figure out the root cause myself. When I try to reproduce one of your examples, applying the count function to the starwars dataset, the output does not show the counts (i.e. the n column) for each category, unlike your example at https://pwwang.github.io/datar/notebooks/count/. My code is the following:
This code leads to this outcome (missing the n column):
![Screenshot 2023-08-20 at 15 50 10](https://private-user-images.githubusercontent.com/92456300/261859329-4b840093-fa92-4160-8b19-cf3178233cab.png)
Expected Behavior
The output should include the counts in a separate column named n, as shown in your tutorial example: https://pwwang.github.io/datar/notebooks/count/
Any hint how I could fix this would be highly appreciated! Many thanks in advance! All the best, Gernot
Installed Versions
---- Update
Hey pwwang! Thanks a lot for looking into this issue! The output of datar.get_versions() is as follows:
python      : 3.10.5 (main, Oct 7 2022, 13:57:40) [Clang 14.0.0 (clang-1400.0.29.102)]
datar       : 0.13.1
simplug     : 0.3.2
executing   : 1.2.0
pipda       : 0.12.0
datar-numpy : 0.2.1
numpy       : 1.25.2
datar-pandas: 0.3.1
pandas      : 2.0.3
Thanks again & Cheers, Gernot