Add guideline for how to get started when using data in CSV files #34

Closed
cchwala opened this issue Mar 11, 2024 · 3 comments · Fixed by #55
Labels
documentation Improvements or additions to documentation enhancement New feature or request good first issue Good for newcomers

Comments

@cchwala
Member

cchwala commented Mar 11, 2024

Many people start with CSV data, and getting from there to an xarray.Dataset is a bit complex when doing it for the first time, so we should have an example notebook that gives a step-by-step guide.

We can maybe use a variation of this code from the sandbox to show the relevant use of xarray.
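
To make the starting point concrete, here is a minimal sketch of the CSV-to-xarray.Dataset step. It is not the sandbox code referred to above. The CSV layout (one row per time stamp, one column per station), the names `rainfall`, `id`, `lon` and `lat`, and all values are assumptions made up for illustration.

```python
import io

import pandas as pd
import xarray as xr

# Hypothetical CSV layout: one row per time stamp, one column per station.
csv_text = """time,station_1,station_2
2024-01-01 00:00:00,0.0,0.1
2024-01-01 00:05:00,0.2,0.0
2024-01-01 00:10:00,0.0,0.3
"""

# Parse the time column into a DatetimeIndex while reading.
df = pd.read_csv(io.StringIO(csv_text), index_col="time", parse_dates=True)

# Hypothetical station metadata; in practice this would likely come
# from a second CSV file.
lon = [11.0, 11.5]
lat = [48.0, 48.2]

# Build the Dataset with explicit dimensions (time, id).
ds = xr.Dataset(
    data_vars={"rainfall": (("time", "id"), df.to_numpy())},
    coords={
        "time": df.index.to_numpy(),
        "id": df.columns.to_numpy(),
        "lon": ("id", lon),
        "lat": ("id", lat),
    },
)
```

Passing the dimensions explicitly keeps the mapping from CSV rows and columns to `time` and `id` visible, which is probably the part newcomers find least obvious.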

@cchwala cchwala added the enhancement New feature or request label Mar 11, 2024
@cchwala
Member Author

cchwala commented Mar 12, 2024

Up for discussion:

  • Should we already resample with pandas before building the xarray.Dataset?
  • Either way, we need to align the time dimension before building the xarray.Dataset, e.g. because (some) PWS data arrive at more-or-less equidistant time stamps, every 5 minutes plus a few seconds.
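
As a rough illustration of the two options in pandas, assuming a single station whose time stamps sit a few seconds off a 5-minute grid (the data and the variable names below are made up):

```python
import pandas as pd

# Hypothetical PWS record: time stamps drift a few seconds off the 5-minute grid.
times = pd.to_datetime(
    [
        "2024-01-01 00:00:07",
        "2024-01-01 00:05:12",
        "2024-01-01 00:10:03",
        "2024-01-01 00:20:09",  # the 00:15 record is missing
    ]
)
rain = pd.Series([0.0, 0.2, 0.1, 0.4], index=times, name="rainfall")

# Option A: snap each time stamp to the nearest 5-minute mark.
# Gaps in the record stay gaps (no 00:15 entry is created).
rounded = rain.copy()
rounded.index = rounded.index.round("5min")

# Option B: resample into 5-minute bins. min_count=1 keeps empty bins
# as NaN instead of turning them into 0.0 when summing.
resampled = rain.resample("5min").sum(min_count=1)
```

The two options are not equivalent: a record at 00:04:58 is rounded to 00:05 by option A but falls into the 00:00 bin in option B, so the choice depends on how the time stamps of a given data source are meant to be read.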

@cchwala cchwala added documentation Improvements or additions to documentation good first issue Good for newcomers labels Mar 18, 2024
@cchwala
Member Author

cchwala commented Mar 18, 2024

@fenclmar The work to solve this issue is strongly related to the planned WG1 work on the conversion of datasets, in case they stem from CSV files. Hence, it might make sense to solve this issue first and then build on top of that.

@cchwala
Member Author

cchwala commented Jun 4, 2024

Based on the discussion during today's video call, we will do the following:

  • add an example notebook that presents a recipe for getting from CSV data to an xarray.Dataset in the correct format
  • do it for PWS first, based on the training school preparation notebook
    • adjust the existing code to use several individual CSV files, one per station (e.g. ten stations for a short period)
    • add example code for handling duplicate coordinates and missing time steps using xarray and/or pandas, related to "Functionality for getting duplicated coordinates" #52
  • (later, add an approach for CML data from CSV; this could be done e.g. with E-Band data from CZ)
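
The PWS steps above could be sketched roughly as follows. This is not the training school notebook. The file contents, station ids and coordinates are invented, and "duplicate coordinates" is read here as stations reporting identical lon/lat, which is an assumption about what #52 covers.

```python
import io

import pandas as pd
import xarray as xr

# Hypothetical stand-ins for individual per-station CSV files.
station_files = {
    "pws_a": """time,rainfall
2024-01-01 00:00,0.0
2024-01-01 00:05,0.2
2024-01-01 00:05,0.2
2024-01-01 00:15,0.1
""",
    "pws_b": """time,rainfall
2024-01-01 00:00,0.3
2024-01-01 00:05,0.0
2024-01-01 00:10,0.0
2024-01-01 00:15,0.5
""",
}
# Hypothetical metadata (lon, lat); both stations report identical coordinates.
coords = {"pws_a": (11.0, 48.0), "pws_b": (11.0, 48.0)}

# Common, gap-free time axis that every station is aligned to.
full_index = pd.date_range(
    "2024-01-01 00:00", "2024-01-01 00:15", freq="5min", name="time"
)

series = {}
for station_id, text in station_files.items():
    df = pd.read_csv(io.StringIO(text), index_col="time", parse_dates=True)
    s = df["rainfall"]
    s = s[~s.index.duplicated(keep="first")]  # drop repeated time stamps
    series[station_id] = s.reindex(full_index)  # missing steps become NaN

wide = pd.DataFrame(series)

# Flag stations that share the same lon/lat pair.
meta = pd.DataFrame.from_dict(coords, orient="index", columns=["lon", "lat"])
duplicated_ids = meta.index[meta.duplicated(keep=False)].tolist()

ds = xr.Dataset(
    {"rainfall": (("time", "id"), wide.to_numpy())},
    coords={
        "time": full_index.to_numpy(),
        "id": list(wide.columns),
        "lon": ("id", meta.loc[wide.columns, "lon"].to_numpy()),
        "lat": ("id", meta.loc[wide.columns, "lat"].to_numpy()),
    },
)
```

Reindexing each station onto a shared time axis before combining them means the missing 00:10 record of `pws_a` shows up as NaN rather than silently shortening the time dimension. What to do with the flagged `duplicated_ids` (drop, merge, or keep) is left open here.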
