Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Documentation for year input for Source methods #273

Merged
merged 6 commits into from
Apr 22, 2024

Conversation

imrnmzri
Copy link
Contributor

Solves #269 by adding a documentation in the "Getting Started" guide on how to include data for multiple years., which is done by including the years as a list using the format [Start Date, End Date]

Copy link
Collaborator

@sowdm sowdm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This definitely makes the year input clearer. Thanks!

I made a couple minor suggestions around line 1167.

"id": "762799a5-5b99-46ca-8b68-401dfdf10100",
"metadata": {},
"source": [
"To include multiple years of data, you can include the years in the \"year\" parameter in the form of [Start Year, Stop Year]\n",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple minor suggestions:

  1. Maybe "To request" instead of "To include"?
  2. Date ranges can only be used for multi-year datasets. I suggest adding something to indicate that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it! I'll reword the statement.

"source": [
"To include multiple years of data, you can include the years in the \"year\" parameter in the form of [Start Year, Stop Year]\n",
"\n",
"This also works in a dataset with only a single year of data"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that you can drop the 2nd sentence

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd be interested to see how the code would run if I were to query a single year dataset using a multi-year input. I notice that if I include a year that does not exist in a multi-year dataset (say 2021-2027), the output will still be from 2021-2024, without any warning on the non-exist years. So I assumed that this would work on a single year dataset?

I might need to find a single year dataset and observe its output

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The way that the year input works rights now needs to be improved to be clearer. If you have any suggestions how to make it clearer or how to better explain it in our documentation, let us know. This is something that we are actively thinking about because we know it should be better.

If you request a year that does not exist (2020) in a multi-year dataset (2021-2024), you should receive in return an empty dataset unless there is a single year dataset for 2020 in which case, you will receive data from the 2020 dataset.

Copy link
Contributor Author

@imrnmzri imrnmzri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I updated my commit with your suggestion, let me know what you think!

Comment on lines 1167 to 1169
"To include multiple years of data, you can include the years in the \"year\" parameter in the form of [Start Year, Stop Year]\n",
"To request multiple years of data, you can include the years in the \"year\" parameter in the form of `[Start Year, Stop Year]`.\n",
"\n",
"This also works in a dataset with only a single year of data"
"Date ranges can only be used for multi-year datasets"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I reworded the statement with your suggestion to highlight that this parameter can only be used for multi-year data

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is good. I like the rewritten 2nd line.

One more thing, can you add something like " (i.e. Year Column is "MULTIPLE" in the src.datasets table or equivalently, coverage_start to coverage_end columns span multiple years)" at the end of the 2nd line"?

That isn't needed specifically for your issue, but I think it would generally be good to define what a multi-year dataset is because it is not currently done anywhere else.

After that, I'll merge the pull request.

Comment on lines 1207 to 1203
"multiyear_tbl"
"multiyear_tbl.table.groupby(\"Year\").size()"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To provide more clarity on the output, I updated this line to show the summarized content of the table using size() to help users understand the size of their data by each year, and also to validate that the year is present on the data

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good addition!

@sowdm sowdm merged commit 95ec3b9 into openpolicedata:main Apr 22, 2024
@sowdm
Copy link
Collaborator

sowdm commented Apr 23, 2024

@imrnmzri Thanks for your help on this! Your documentation update was automatically included in our "latest" documentation, which will become part of the stable documentation with the next code release which hopefully will happen in the next week

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants