Dataset used in examples jupyter notebook is failing when jupyter notebook is run #6982

Closed
arunjose696 opened this issue Feb 29, 2024 · 1 comment · Fixed by #6993

Comments

@arunjose696
Collaborator

The dataset used in the notebook is named yellow_tripdata_2015-01.csv and is hosted at
https://modin-datasets.intel.com/testing/yellow_tripdata_2015-01.csv. The hosted file is not the yellow_tripdata_2015-01.csv that the notebook expects, so the Jupyter notebook fails, as mentioned in #6964 (comment).

Either the hosted dataset should be changed or the snippet should be updated.

@arunjose696
Collaborator Author

The official source of the yellow taxi dataset is https://www.nyc.gov/site/tlc/about/tlc-trip-record-data.page. Cross-checking against the data dictionary for this dataset, data_dictionary_trip_records_yellow.pdf (nyc.gov), it can be concluded that the data used in the Jupyter notebook is not yellow taxi data, since a Dropoff_longitude column was expected in the data.

On debugging further, I observed that the data actually expected was the Green taxi dataset, so I am updating the dataset used in the Jupyter notebook with #6993.
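
For anyone who wants to repeat this kind of check, here is a minimal sketch of comparing a CSV header against the columns a notebook expects. It is not the code from the notebook or from #6993. The helper name `missing_columns` and the sample header are made up for illustration; only `Dropoff_longitude` comes from the discussion above. The comparison is case-sensitive, which is an assumption about how the notebook looks up columns.

```python
import csv
import io


def missing_columns(csv_file, expected):
    """Return the expected column names that are absent from the CSV header row."""
    # Read only the first row; an empty file yields an empty header.
    header = next(csv.reader(csv_file), [])
    present = {name.strip() for name in header}
    # Case-sensitive comparison (assumption: the notebook uses exact names).
    return [name for name in expected if name not in present]


# Illustrative header only; it stands in for the first line of a hosted file.
sample = io.StringIO("VendorID,pickup_longitude,dropoff_longitude\n1,-73.9,-73.8\n")
print(missing_columns(sample, ["Dropoff_longitude"]))
```

In practice you would pass an open file handle for the downloaded CSV instead of the in-memory sample. Only the header row is read, so the check stays cheap even for a large file.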

YarShev pushed a commit that referenced this issue Mar 4, 2024
…dataset (#6993)

Signed-off-by: arunjose696 <arunjose696@gmail.com>