In the bill_similarity function, find the latest version of a bill to process #144

Closed
aih opened this issue Feb 2, 2021 · 1 comment

aih commented Feb 2, 2021

Each billnumber (e.g. 116hr133) may have more than one version. When calculating bill similarity, we currently process all of the bill versions but store bill_similarity for only one of them, chosen arbitrarily.

Instead, for each bill, we should calculate bill_similarity for the version with the most recent date. To do this, we can search Elasticsearch for the billnumber, which should retrieve all versions, and then read the date from each version. The date is a string of the form 2019-01-03. (I just added this to elastic_load.py, so it would require re-running the bills.)
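
A minimal sketch of the selection step, assuming the versions returned for a billnumber have already been reduced to plain dicts. The `id` and `date` keys and the `latest_version` helper are hypothetical names for illustration, not the actual index schema or code in this repo:

```python
from datetime import date
from typing import Dict, List, Optional


def latest_version(versions: List[Dict[str, str]]) -> Optional[Dict[str, str]]:
    """Return the version with the most recent date, or None if none is dated.

    Assumes each dict has a 'date' string starting with YYYY-MM-DD
    (hypothetical key name). Versions with a missing or empty date are skipped.
    """
    dated = [v for v in versions if v.get("date")]
    if not dated:
        return None
    # Only the first 10 characters are parsed, so '2019-01-03' and
    # '2019-01-03T05:00:00Z' are both accepted.
    return max(dated, key=lambda v: date.fromisoformat(v["date"][:10]))


# Example input shaped like a list of search hits for 116hr133 (illustrative).
hits = [
    {"id": "116hr133ih", "date": "2019-01-03"},
    {"id": "116hr133eh", "date": "2019-01-10"},
    {"id": "116hr133enr", "date": ""},
]
print(latest_version(hits))
```

Skipping undated versions is a design choice in this sketch; a version with no date in the index would never be selected this way.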

This information is also in the fdsys_billstatus.xml file for the bill, in the textVersions element (see below). However, getting the data from the Elasticsearch index ensures that the version we pick is actually in the index to begin with.

<textVersions>
<item>
<type>Enrolled Bill</type>
<date/>
<formats>
<item>
<url>
https://www.govinfo.gov/content/pkg/BILLS-116hr133enr/xml/BILLS-116hr133enr.xml
</url>
</item>
</formats>
</item>
<item>
<type>Engrossed Amendment House</type>
<date>2020-12-21T05:00:00Z</date>
<formats>
<item>
<url>
https://www.govinfo.gov/content/pkg/BILLS-116hr133eah/xml/BILLS-116hr133eah.xml
</url>
</item>
</formats>
</item>
<item>
<type>Engrossed Amendment Senate</type>
<date>2020-01-15T05:00:00Z</date>
<formats>
<item>
<url>
https://www.govinfo.gov/content/pkg/BILLS-116hr133eas/xml/BILLS-116hr133eas.xml
</url>
</item>
</formats>
</item>
<item>
<type>Reported to Senate</type>
<date>2019-12-17T05:00:00Z</date>
<formats>
<item>
<url>
https://www.govinfo.gov/content/pkg/BILLS-116hr133rs/xml/BILLS-116hr133rs.xml
</url>
</item>
</formats>
</item>
<item>
<type>Referred in Senate</type>
<date>2019-01-11T05:00:00Z</date>
<formats>
<item>
<url>
https://www.govinfo.gov/content/pkg/BILLS-116hr133rfs/xml/BILLS-116hr133rfs.xml
</url>
</item>
</formats>
</item>
<item>
<type>Engrossed in House</type>
<date>2019-01-10T05:00:00Z</date>
<formats>
<item>
<url>
https://www.govinfo.gov/content/pkg/BILLS-116hr133eh/xml/BILLS-116hr133eh.xml
</url>
</item>
</formats>
</item>
<item>
<type>Introduced in House</type>
<date>2019-01-03T05:00:00Z</date>
<formats>
<item>
<url>
https://www.govinfo.gov/content/pkg/BILLS-116hr133ih/xml/BILLS-116hr133ih.xml
</url>
</item>
</formats>
</item>
</textVersions>
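
For comparison, here is a rough sketch of reading the latest dated version from a textVersions fragment like the one above, using only the standard library. The `latest_text_version` helper is a hypothetical name, and the sketch assumes the fragment is passed on its own with `<textVersions>` as the root. Items with an empty `<date/>` are skipped, so in this sample the Enrolled Bill would not be picked even though it is listed first:

```python
import xml.etree.ElementTree as ET
from datetime import datetime
from typing import Optional, Tuple

# Shortened copy of the fragment above (three of the seven items).
SAMPLE = """<textVersions>
  <item>
    <type>Enrolled Bill</type>
    <date/>
    <formats><item><url>
      https://www.govinfo.gov/content/pkg/BILLS-116hr133enr/xml/BILLS-116hr133enr.xml
    </url></item></formats>
  </item>
  <item>
    <type>Engrossed Amendment House</type>
    <date>2020-12-21T05:00:00Z</date>
    <formats><item><url>
      https://www.govinfo.gov/content/pkg/BILLS-116hr133eah/xml/BILLS-116hr133eah.xml
    </url></item></formats>
  </item>
  <item>
    <type>Introduced in House</type>
    <date>2019-01-03T05:00:00Z</date>
    <formats><item><url>
      https://www.govinfo.gov/content/pkg/BILLS-116hr133ih/xml/BILLS-116hr133ih.xml
    </url></item></formats>
  </item>
</textVersions>"""


def latest_text_version(xml_text: str) -> Optional[Tuple[datetime, str, str]]:
    """Return (date, type, url) for the most recently dated item, or None."""
    root = ET.fromstring(xml_text)
    best = None
    # findall("item") matches direct children only, so the nested
    # formats/item elements are not mistaken for versions.
    for item in root.findall("item"):
        raw = (item.findtext("date") or "").strip()
        if not raw:
            continue  # e.g. <date/> on the Enrolled Bill
        when = datetime.strptime(raw, "%Y-%m-%dT%H:%M:%SZ")
        url = (item.findtext("formats/item/url") or "").strip()
        kind = (item.findtext("type") or "").strip()
        if best is None or when > best[0]:
            best = (when, kind, url)
    return best


print(latest_text_version(SAMPLE))
```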

aih commented Feb 5, 2021

Closed with #149

aih closed this as completed Feb 5, 2021