What about additions/deletions? #6

Open
jeremybmerrill opened this issue Feb 14, 2019 · 4 comments

Comments

@jeremybmerrill jeremybmerrill commented Feb 14, 2019

e.g. https://legis.delaware.gov/json/BillDetail/GenerateHtmlDocument?legislationId=26010&legislationTypeId=1&docTypeId=2&legislationName=HB248ocd-bill/5793a6bb-6dcd-496d-b970-419568789932 has struck-through text (marked by CSS class, frustratingly) and underlined text to indicate deletions and additions, respectively. What do we want to extract? I'm thinking we would exclude the deletions from our conception of "bill text", but include the existing law (for context, so the result is semi-readable) and the additions. What do you all think?
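
To make the "exclude deletions, keep existing law and additions" idea concrete, here is a rough sketch using only the standard library. The class name `strikethrough` is an assumption (I have not pinned down the actual class Delaware uses), and `BillTextExtractor` / `extract_text` are hypothetical names, not anything that exists in this project. It also assumes reasonably well-formed HTML.

```python
from html.parser import HTMLParser

# Assumption: the CSS class that marks struck-through (deleted) text.
# The real Delaware markup may use a different class name.
DELETION_CLASSES = {"strikethrough"}

# Elements that never have a closing tag, so they must not be pushed on the stack.
VOID_TAGS = {"br", "hr", "img", "input", "meta", "link"}


class BillTextExtractor(HTMLParser):
    """Collect text, skipping anything nested inside a deletion element."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self.stack = []  # one bool per open element: is it marked as a deletion?

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = set((dict(attrs).get("class") or "").split())
        self.stack.append(bool(classes & DELETION_CLASSES))

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.stack:
            self.stack.pop()

    def handle_data(self, data):
        # Keep text only when no enclosing element is a deletion.
        if not any(self.stack):
            self.parts.append(data)


def extract_text(html):
    """Return existing law plus additions, with deletions dropped."""
    parser = BillTextExtractor()
    parser.feed(html)
    parser.close()
    # Collapse the whitespace left behind where deletions were removed.
    return " ".join("".join(parser.parts).split())


sample = '<p>A person <span class="strikethrough">shall</span> <u>must</u> file.</p>'
print(extract_text(sample))
```

Underlined additions need no special handling here because they are kept along with the surrounding statute text.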

@jeremybmerrill jeremybmerrill commented Feb 14, 2019

@jeremybmerrill jeremybmerrill commented Feb 14, 2019

It would increase complexity, but another option is to output two text files: one that includes the deletions in the bill text and another that excludes them.

I can see us wanting to include the deletion text if this is, for instance, feeding Elasticsearch: a bill that deletes a word from statute should still come up in a search for that word. But for machine-learning purposes, it might not...
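
A minimal sketch of the two-output idea, assuming the extractor has already split the document into tagged segments. The `(kind, text)` segment shape, the kind names, and `render_variants` are all hypothetical, made up here for illustration.

```python
def render_variants(segments):
    """Build both text variants from (kind, text) segments.

    kind is assumed to be one of "existing", "addition", or "deletion".
    """

    def join(kinds):
        return " ".join(text for kind, text in segments if kind in kinds)

    return {
        # Search variant: keep deletions so a bill that strikes a word still matches it.
        "with_deletions": join({"existing", "addition", "deletion"}),
        # ML variant: only the text as it would read after the bill passes.
        "without_deletions": join({"existing", "addition"}),
    }


segments = [
    ("existing", "A person"),
    ("deletion", "shall"),
    ("addition", "must"),
    ("existing", "file."),
]
variants = render_variants(segments)
print(variants["with_deletions"])
print(variants["without_deletions"])
```

Each variant could then be written to its own file, at the cost of doubling the output per bill.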

@jeremybmerrill jeremybmerrill commented Feb 15, 2019

In some cases the formatting may indicate changes between amended bill versions.

In other cases it may be a "diff" of a statute.

We may want to treat these differently.

@jamesturk jamesturk commented Dec 12, 2019

This is still unresolved, since in most cases it will result in a good amount of added complexity. I'm all for tackling this but don't have the bandwidth/need for now. If someone comes along & does, I'd be more than glad to discuss options!
