Locations endpoint throwing 500 errors #24

Closed
crew102 opened this issue Mar 9, 2017 · 9 comments

@crew102

crew102 commented Mar 9, 2017

Hi -

For each endpoint except for locations, I'm able to request all of the possible fields and I get the data I would expect. When I request all of the retrievable fields for the locations endpoint, I get either a 400 or a 500 error. For example, see the request below, which resulted in a 400 error. Do you know why this would be the case? (Note: we now know why the 400 errors are happening, and this is now a separate issue: #29.)

Thanks,
Chris

POST /api/locations/query HTTP/1.1
Host: www.patentsview.org
Content-Type: application/x-www-form-urlencoded
user_agent: chriscrewbaker@gmail.com
Cache-Control: no-cache
Postman-Token: eb64a699-7996-5964-2b33-f630879b243a

{"q":{"patent_number":"5116621"},"f":["app_country","app_date","app_number","app_type","appcit_app_number","appcit_category","appcit_date","appcit_kind","appcit_sequence","assignee_first_name","assignee_first_seen_date","assignee_id","assignee_last_name","assignee_last_seen_date","assignee_lastknown_city","assignee_lastknown_country","assignee_lastknown_latitude","assignee_lastknown_location_id","assignee_lastknown_longitude","assignee_lastknown_state","assignee_num_patents_for_location","assignee_organization","assignee_total_num_inventors","assignee_total_num_patents","assignee_type","cited_patent_category","cited_patent_date","cited_patent_kind","cited_patent_number","cited_patent_sequence","cited_patent_title","citedby_patent_category","citedby_patent_date","citedby_patent_kind","citedby_patent_number","citedby_patent_title","cpc_category","cpc_first_seen_date","cpc_group_id","cpc_group_title","cpc_last_seen_date","cpc_num_patents_for_location","cpc_section_id","cpc_sequence","cpc_subgroup_id","cpc_subgroup_title","cpc_subsection_id","cpc_subsection_title","cpc_total_num_assignees","cpc_total_num_inventors","cpc_total_num_patents","govint_contract_award_number","govint_org_id","govint_org_level_one","govint_org_level_three","govint_org_level_two","govint_org_name","govint_raw_statement","inventor_first_name","inventor_first_seen_date","inventor_id","inventor_last_name","inventor_last_seen_date","inventor_lastknown_city","inventor_lastknown_country","inventor_lastknown_latitude","inventor_lastknown_location_id","inventor_lastknown_longitude","inventor_lastknown_state","inventor_num_patents_for_location","inventor_total_num_patents","ipc_action_date","ipc_class","ipc_classification_data_source","ipc_classification_value","ipc_first_seen_date","ipc_last_seen_date","ipc_main_group","ipc_section","ipc_sequence","ipc_subclass","ipc_subgroup","ipc_symbol_position","ipc_total_num_assignees","ipc_total_num_inventors","ipc_version_indicator","location_city","location_country","location_id","location_key_id","location_latitude","location_longitude","location_state","location_total_num_assignees","location_total_num_inventors","location_total_num_patents","nber_category_id","nber_category_title","nber_first_seen_date","nber_last_seen_date","nber_num_patents_for_location","nber_subcategory_id","nber_subcategory_title","nber_total_num_assignees","nber_total_num_inventors","nber_total_num_patents","patent_abstract","patent_average_processing_time","patent_date","patent_firstnamed_assignee_city","patent_firstnamed_assignee_country","patent_firstnamed_assignee_id","patent_firstnamed_assignee_latitude","patent_firstnamed_assignee_location_id","patent_firstnamed_assignee_longitude","patent_firstnamed_assignee_state","patent_firstnamed_inventor_city","patent_firstnamed_inventor_country","patent_firstnamed_inventor_id","patent_firstnamed_inventor_latitude","patent_firstnamed_inventor_location_id","patent_firstnamed_inventor_longitude","patent_firstnamed_inventor_state","patent_id","patent_kind","patent_num_cited_by_us_patents","patent_num_cited_by_us_patents_for_location","patent_num_claims","patent_num_combined_citations","patent_num_foreign_citations","patent_num_us_application_citations","patent_num_us_patent_citations","patent_number","patent_processing_time","patent_title","patent_type","patent_year","rawinventor_first_name","rawinventor_last_name","uspc_first_seen_date","uspc_last_seen_date","uspc_mainclass_id","uspc_mainclass_title","uspc_num_patents_for_location","uspc_sequence","uspc_subclass_id","uspc_subclass_title","uspc_total_num_assignees","uspc_total_num_inventors","uspc_total_num_patents","wipo_field_id","wipo_field_title","wipo_sector_title","wipo_sequence"],"o":{"include_subentity_total_counts":false,"matched_subentities_only":true,"page":1,"per_page":25},"s":{}}
@crew102 crew102 changed the title from "Inventors endpoint throwing 400 and 500 errors" to "Locations endpoint throwing 400 and 500 errors" on Mar 9, 2017
@mustberuss

@crew102 FWIW, the presence of cpc_sequence is the cause of the 400 error (X-Status-Reason: Invalid field specified: cpc_sequence), and setting matched_subentities_only to true causes the 500 error. The query works using all the fields on the locations API page except for cpc_sequence when matched_subentities_only=false.

I hope this helps narrow down these problems in the api.

-> POST /api/locations/query HTTP/1.1
-> Host: www.patentsview.org
-> User-Agent: https://github.com/ropensci/patentsview
-> Accept-Encoding: gzip, deflate
-> Accept: application/json, text/xml, application/xml, */*
-> Content-Length: 4393
->
>> {"q":{"patent_number":"5116621"},"f":["app_country","app_date","app_number","app_type","appcit_app_number","appcit_category","appcit_date","appcit_kind","appcit_sequence","assignee_first_name","assignee_first_seen_date","assignee_id","assignee_last_name","assignee_last_seen_date","assignee_lastknown_city","assignee_lastknown_country","assignee_lastknown_latitude","assignee_lastknown_location_id","assignee_lastknown_longitude","assignee_lastknown_state","assignee_num_patents_for_location","assignee_organization","assignee_total_num_inventors","assignee_total_num_patents","assignee_type","cited_patent_category","cited_patent_date","cited_patent_kind","cited_patent_number","cited_patent_sequence","cited_patent_title","citedby_patent_category","citedby_patent_date","citedby_patent_kind","citedby_patent_number","citedby_patent_title","cpc_category","cpc_first_seen_date","cpc_group_id","cpc_group_title","cpc_last_seen_date","cpc_num_patents_for_location","cpc_section_id","cpc_subgroup_id","cpc_subgroup_title","cpc_subsection_id","cpc_subsection_title","cpc_total_num_assignees","cpc_total_num_inventors","cpc_total_num_patents","detail_desc_length","examiner_first_name","examiner_group","examiner_id","examiner_last_name","examiner_role","forprior_country","forprior_date","forprior_docnumber","forprior_kind","forprior_sequence","govint_contract_award_number","govint_org_id","govint_org_level_one","govint_org_level_three","govint_org_level_two","govint_org_name","govint_raw_statement","inventor_first_name","inventor_first_seen_date","inventor_id","inventor_last_name","inventor_last_seen_date","inventor_lastknown_city","inventor_lastknown_country","inventor_lastknown_latitude","inventor_lastknown_location_id","inventor_lastknown_longitude","inventor_lastknown_state","inventor_num_patents_for_location","inventor_total_num_patents","ipc_action_date","ipc_class","ipc_classification_data_source","ipc_classification_value","ipc_first_seen_date","ipc_last_seen_date","ipc_main_group","ipc_section","ipc_sequence","ipc_subclass","ipc_subgroup","ipc_symbol_position","ipc_total_num_assignees","ipc_total_num_inventors","ipc_version_indicator","lawyer_first_name","lawyer_first_seen_date","lawyer_id","lawyer_last_name","lawyer_last_seen_date","lawyer_organization","lawyer_sequence","lawyer_total_num_assignees","lawyer_total_num_inventors","lawyer_total_num_patents","location_city","location_country","location_county","location_county_fips","location_id","location_key_id","location_latitude","location_longitude","location_state","location_state_fips","location_total_num_assignees","location_total_num_inventors","location_total_num_patents","nber_category_id","nber_category_title","nber_first_seen_date","nber_last_seen_date","nber_num_patents_for_location","nber_subcategory_id","nber_subcategory_title","nber_total_num_assignees","nber_total_num_inventors","nber_total_num_patents","patent_abstract","patent_average_processing_time","patent_date","patent_firstnamed_assignee_city","patent_firstnamed_assignee_country","patent_firstnamed_assignee_id","patent_firstnamed_assignee_latitude","patent_firstnamed_assignee_location_id","patent_firstnamed_assignee_longitude","patent_firstnamed_assignee_state","patent_firstnamed_inventor_city","patent_firstnamed_inventor_country","patent_firstnamed_inventor_id","patent_firstnamed_inventor_latitude","patent_firstnamed_inventor_location_id","patent_firstnamed_inventor_longitude","patent_firstnamed_inventor_state","patent_id","patent_kind","patent_num_cited_by_us_patents","patent_num_cited_by_us_patents_for_location","patent_num_claims","patent_num_combined_citations","patent_num_foreign_citations","patent_num_us_application_citations","patent_num_us_patent_citations","patent_number","patent_processing_time","patent_title","patent_type","patent_year","pct_102_date","pct_371_date","pct_date","pct_docnumber","pct_doctype","pct_kind","rawinventor_first_name","rawinventor_last_name","uspc_first_seen_date","uspc_last_seen_date","uspc_mainclass_id","uspc_mainclass_title","uspc_num_patents_for_location","uspc_sequence","uspc_subclass_id","uspc_subclass_title","uspc_total_num_assignees","uspc_total_num_inventors","uspc_total_num_patents","wipo_field_id","wipo_field_title","wipo_sector_title","wipo_sequence"],"o":{"include_subentity_total_counts":false,"matched_subentities_only":false,"page":1,"per_page":25},"s":{}}

<- HTTP/1.1 200 OK

@mustberuss

I wrote a script to incrementally add location fields to the query. I found that with matched_subentities_only=true the assignee, cpc, nber, rawinventor and uspc fields shown below caused 500 errors. Below the list is the largest subset of fields that worked; adding any more caused a 500 error. @crew102 Perhaps these fields could be unmapped from the R package until the API is fixed for this endpoint?

   # 400 invalid field specified: cpc_sequence  others caused 500 errors
   troublesome = c("cpc_sequence", "assignee_first_name","assignee_first_seen_date","assignee_id","assignee_last_name","assignee_last_seen_date","assignee_lastknown_city","assignee_lastknown_country","assignee_lastknown_latitude","assignee_lastknown_location_id","assignee_lastknown_longitude","assignee_lastknown_state","assignee_num_patents_for_location","assignee_organization","assignee_total_num_inventors","assignee_total_num_patents","assignee_type","cpc_category","cpc_first_seen_date","cpc_group_id","cpc_group_title","cpc_last_seen_date","cpc_num_patents_for_location","cpc_section_id","cpc_subgroup_id","cpc_total_num_assignees","cpc_subgroup_title","cpc_subsection_id","cpc_subsection_title","cpc_total_num_inventors","cpc_total_num_patents","nber_category_id","nber_category_title","nber_first_seen_date","nber_last_seen_date","nber_num_patents_for_location","nber_subcategory_id","nber_subcategory_title","nber_total_num_assignees","nber_total_num_inventors","nber_total_num_patents","rawinventor_first_name","rawinventor_last_name","uspc_first_seen_date","uspc_last_seen_date","uspc_mainclass_id","uspc_mainclass_title","uspc_num_patents_for_location","uspc_sequence","uspc_subclass_id","uspc_subclass_title","uspc_total_num_assignees","uspc_total_num_inventors","uspc_total_num_patents")
-> POST /api/locations/query HTTP/1.1
-> Host: www.patentsview.org
-> User-Agent: https://github.com/ropensci/patentsview
-> Accept-Encoding: gzip, deflate
-> Accept: application/json, text/xml, application/xml, */*
-> Content-Length: 3124
->
>> {"q":{"patent_number":"5116621"},"f":["app_country","app_date","app_number","app_type","appcit_app_number","appcit_category","appcit_date","appcit_kind","appcit_sequence","cited_patent_category","cited_patent_date","cited_patent_kind","cited_patent_number","cited_patent_sequence","cited_patent_title","citedby_patent_category","citedby_patent_date","citedby_patent_kind","citedby_patent_number","citedby_patent_title","detail_desc_length","examiner_first_name","examiner_group","examiner_id","examiner_last_name","examiner_role","forprior_country","forprior_date","forprior_docnumber","forprior_kind","forprior_sequence","govint_contract_award_number","govint_org_id","govint_org_level_one","govint_org_level_three","govint_org_level_two","govint_org_name","govint_raw_statement","inventor_first_name","inventor_first_seen_date","inventor_id","inventor_last_name","inventor_last_seen_date","inventor_lastknown_city","inventor_lastknown_country","inventor_lastknown_latitude","inventor_lastknown_location_id","inventor_lastknown_longitude","inventor_lastknown_state","inventor_num_patents_for_location","inventor_total_num_patents","ipc_action_date","ipc_class","ipc_classification_data_source","ipc_classification_value","ipc_first_seen_date","ipc_last_seen_date","ipc_main_group","ipc_section","ipc_sequence","ipc_subclass","ipc_subgroup","ipc_symbol_position","ipc_total_num_assignees","ipc_total_num_inventors","ipc_version_indicator","lawyer_first_name","lawyer_first_seen_date","lawyer_id","lawyer_last_name","lawyer_last_seen_date","lawyer_organization","lawyer_sequence","lawyer_total_num_assignees","lawyer_total_num_inventors","lawyer_total_num_patents","location_city","location_country","location_county","location_county_fips","location_id","location_key_id","location_latitude","location_longitude","location_state","location_state_fips","location_total_num_assignees","location_total_num_inventors","location_total_num_patents","patent_abstract","patent_average_processing_time","patent_date","patent_firstnamed_assignee_city","patent_firstnamed_assignee_country","patent_firstnamed_assignee_id","patent_firstnamed_assignee_latitude","patent_firstnamed_assignee_location_id","patent_firstnamed_assignee_longitude","patent_firstnamed_assignee_state","patent_firstnamed_inventor_city","patent_firstnamed_inventor_country","patent_firstnamed_inventor_id","patent_firstnamed_inventor_latitude","patent_firstnamed_inventor_location_id","patent_firstnamed_inventor_longitude","patent_firstnamed_inventor_state","patent_id","patent_kind","patent_num_cited_by_us_patents","patent_num_cited_by_us_patents_for_location","patent_num_claims","patent_num_combined_citations","patent_num_foreign_citations","patent_num_us_application_citations","patent_num_us_patent_citations","patent_number","patent_processing_time","patent_title","patent_type","patent_year","pct_102_date","pct_371_date","pct_date","pct_docnumber","pct_doctype","pct_kind","wipo_field_id","wipo_field_title","wipo_sector_title","wipo_sequence"],"o":{"include_subentity_total_counts":false,"matched_subentities_only":true,"page":1,"per_page":25},"s":{}}

<- HTTP/1.1 200 OK
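
For anyone who wants to repeat the incremental probing described in this comment, here is a minimal sketch in base R. It is not the script that was actually used: `probe_fields()` and `fake_request()` are hypothetical names, and the real API call is replaced by a stand-in callback so the logic can be run offline.

```r
# Hypothetical helper (not part of patentsview): add fields one at a time and
# keep only those that leave the request working.
# `try_request` is an assumed callback: it takes a character vector of fields
# and returns TRUE if the request succeeds, FALSE if it errors.
probe_fields <- function(fields, try_request) {
  working <- character(0)
  failing <- character(0)
  for (f in fields) {
    candidate <- c(working, f)
    if (isTRUE(try_request(candidate))) {
      working <- candidate        # request still succeeds with this field
    } else {
      failing <- c(failing, f)    # adding this field broke the request
    }
  }
  list(working = working, failing = failing)
}

# Stand-in for a real API call (assumption): pretend that any cpc_ or nber_
# field makes the request fail.
fake_request <- function(fields) !any(grepl("^(cpc|nber)_", fields))

result <- probe_fields(
  c("location_id", "cpc_group_id", "patent_number", "nber_category_id"),
  fake_request
)
print(result$working)
print(result$failing)
```

Against the live API, the callback could presumably wrap `search_pv()` in `tryCatch()` and return FALSE from the error handler. Note that this greedy approach finds one working subset; it does not prove that a skipped field would fail on its own.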

@crew102
Author

crew102 commented Dec 27, 2017

Hi @mustberuss, I suspect that setting matched_subentities_only=true means that the API doesn't actually have to worry about returning some of the fields that you identified in the "largest subset," because there aren't any matching subentities to return (at least for patent number 5116621), though I could be wrong about this. Regarding unmapping the fields in the patentsview R package, are you suggesting that, for example, patentsview::get_fields(endpoint = "locations") return only those fields that are in the "largest working subset" that you identified? If so, I think I would prefer that the package implement the following instead:

If both of these are true:

  1. The user has included at least one "problematic field" in their query to the locations endpoint.
     • problematic field = any field that has been resulting in the unexpected errors that I mentioned in my original post on this issue, i.e., the 400 or 500 errors that are occurring for the locations endpoint. A non-problematic field is a queryable field for the locations endpoint that doesn't throw an error regardless of the value of matched_subentities_only.
  2. The HTTP request actually results in a 400 or 500 error.

Then search_pv() should throw a custom error message that explains to the user that some queryable fields for the locations endpoint are actually not working at the moment, and then lists the subentity groups that have been problematic.
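
The two conditions above could be checked roughly as follows. This is only a sketch of the proposal, not code from the patentsview package: `locations_error_message()` is a hypothetical name, the list of problematic fields is passed in by the caller, and the subentity group is guessed from the text before the first underscore, which is an assumption.

```r
# Hypothetical check (not part of patentsview): build the custom message only
# when BOTH conditions from the proposal hold.
locations_error_message <- function(fields, status, problematic_fields) {
  hit <- intersect(fields, problematic_fields)
  if (length(hit) == 0 || !(status %in% c(400L, 500L))) {
    return(NULL)  # a condition is false: leave the default error handling alone
  }
  # Assumption: the group name is the prefix before the first underscore.
  groups <- unique(sub("_.*$", "", hit))
  paste0(
    "Some queryable fields for the locations endpoint are currently not ",
    "working. Problematic groups in this request: ",
    paste(sort(groups), collapse = ", ")
  )
}

msg <- locations_error_message(
  fields = c("location_id", "cpc_group_id", "uspc_sequence"),
  status = 500L,
  problematic_fields = c("cpc_group_id", "uspc_sequence", "nber_category_id")
)
print(msg)
```

Returning NULL when either condition fails keeps the existing error path untouched, so the custom message only ever replaces errors that match the known problem.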

Let's move any further discussion into a new issue at ropensci/patentsview

@mustberuss

"i.e., the 400 or 500 errors that are occurring for the inventors endpoint."

@crew102 just to clarify, I believe you mean the locations endpoint

@mustberuss

In looking at this further, I don't believe there are any troublesome fields. It's the number of groups present that seems to be the problem. With the exception of cpc_sequence's 400, I can query for each group's fields in individual calls to the locations endpoint. I now think the complexity of the SQL in entitySpecs.php for $LOCATION_ENTITY_SPECS is the problem when multiple groups are present. I get back this X-Status-Reason header with the 500.

<- HTTP/1.1 500 Internal Server Error
<- X-Status-Reason: Query execution failed.

I'm wondering if the call times out and throws the 500 when too many groups are present. As @crew102 mentioned originally, for other endpoints he can request every possible field successfully in a single api call. It's just the locations endpoint that has this problem.

Successful individual requests:

"f":["appcit_app_number","appcit_category","appcit_date","appcit_kind","appcit_sequence"]
"f":["app_country","app_date","app_number","app_type"]
"f":["cited_patent_category","cited_patent_date","cited_patent_kind","cited_patent_number","cited_patent_sequence","cited_patent_title"]
"f":["citedby_patent_category","citedby_patent_date","citedby_patent_kind","citedby_patent_number","citedby_patent_title"]
"f":["cpc_category","cpc_first_seen_date","cpc_group_id","cpc_group_title","cpc_last_seen_date","cpc_num_patents_for_location","cpc_section_id","cpc_subgroup_id","cpc_subgroup_title","cpc_subsection_id","cpc_subsection_title","cpc_total_num_assignees","cpc_total_num_inventors","cpc_total_num_patents"]
"f":["examiner_first_name","examiner_group","examiner_id","examiner_last_name","examiner_role"]
"f":["forprior_country","forprior_date","forprior_docnumber","forprior_kind","forprior_sequence"]
"f":["govint_contract_award_number","govint_org_id","govint_org_level_one","govint_org_level_three","govint_org_level_two","govint_org_name","govint_raw_statement"]
"f":["inventor_first_name","inventor_first_seen_date","inventor_id","inventor_last_name","inventor_last_seen_date","inventor_lastknown_city","inventor_lastknown_country","inventor_lastknown_latitude","inventor_lastknown_location_id","inventor_lastknown_longitude","inventor_lastknown_state","inventor_num_patents_for_location","inventor_total_num_patents"]
"f":["ipc_action_date","ipc_class","ipc_classification_data_source","ipc_classification_value","ipc_first_seen_date","ipc_last_seen_date","ipc_main_group","ipc_section","ipc_sequence","ipc_subclass","ipc_subgroup","ipc_symbol_position","ipc_total_num_assignees","ipc_total_num_inventors","ipc_version_indicator"]
"f":["lawyer_first_name","lawyer_first_seen_date","lawyer_id","lawyer_last_name","lawyer_last_seen_date","lawyer_organization","lawyer_sequence","lawyer_total_num_assignees","lawyer_total_num_inventors","lawyer_total_num_patents"]
"f":["location_city","location_country","location_county","location_county_fips","location_id","location_key_id","location_latitude","location_longitude","location_state","location_state_fips","location_total_num_assignees","location_total_num_inventors","location_total_num_patents"]
"f":["nber_category_id","nber_category_title","nber_first_seen_date","nber_last_seen_date","nber_num_patents_for_location","nber_subcategory_id","nber_subcategory_title","nber_total_num_assignees","nber_total_num_inventors","nber_total_num_patents"]
"f":["detail_desc_length","patent_abstract","patent_average_processing_time","patent_date","patent_firstnamed_assignee_city","patent_firstnamed_assignee_country","patent_firstnamed_assignee_id","patent_firstnamed_assignee_latitude","patent_firstnamed_assignee_location_id","patent_firstnamed_assignee_longitude","patent_firstnamed_assignee_state","patent_firstnamed_inventor_city","patent_firstnamed_inventor_country","patent_firstnamed_inventor_id","patent_firstnamed_inventor_latitude","patent_firstnamed_inventor_location_id","patent_firstnamed_inventor_longitude","patent_firstnamed_inventor_state","patent_id","patent_kind","patent_num_cited_by_us_patents","patent_num_cited_by_us_patents_for_location","patent_num_claims","patent_num_combined_citations","patent_num_foreign_citations","patent_num_us_application_citations","patent_num_us_patent_citations","patent_number","patent_processing_time","patent_title","patent_type","patent_year"]
"f":["pct_102_date","pct_371_date","pct_date","pct_docnumber","pct_doctype","pct_kind"]
"f":["rawinventor_first_name","rawinventor_last_name"]
"f":["uspc_first_seen_date","uspc_last_seen_date","uspc_mainclass_id","uspc_mainclass_title","uspc_num_patents_for_location","uspc_sequence","uspc_subclass_id","uspc_subclass_title","uspc_total_num_assignees","uspc_total_num_inventors","uspc_total_num_patents"]
"f":["wipo_field_id","wipo_field_title","wipo_sector_title","wipo_sequence"]

with credit to @crew102 for suggesting we look at the (now vindicated) troublesome groups.
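
The per-group field lists above can be produced from a full field vector instead of being typed by hand. The sketch below is an assumption-laden illustration: `group_fields()` is a hypothetical helper, and it groups by the text before the first underscore, which only approximates the real groups (for example, detail_desc_length would land in its own "detail" group rather than with the patent fields).

```r
# Hypothetical helper: split a vector of field names into groups by prefix.
# Assumption: the prefix before the first underscore identifies the group.
group_fields <- function(fields) {
  prefix <- sub("_.*$", "", fields)
  split(fields, prefix)
}

fields <- c(
  "app_country", "app_date", "appcit_date",
  "cited_patent_date", "citedby_patent_date",
  "patent_id", "wipo_sequence"
)
groups <- group_fields(fields)
print(names(groups))

# One request per group could then look like this (not run here, needs the
# patentsview package and network access):
# res <- lapply(groups, function(f) {
#   search_pv('{"patent_number":"5116621"}', endpoint = "locations", fields = f)
# })
```

Issuing one call per group trades a single large query for several small ones, which matches what worked in the manual tests listed above.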

@crew102 crew102 changed the title from "Locations endpoint throwing 400 and 500 errors" to "Locations endpoint throwing 500 errors" on Dec 31, 2017
@sarahkelley

Thanks for debugging this so thoroughly. We've added this to our issue tracker and will follow up when the problem is resolved!

@mustberuss

Now I'm seeing a 500 error thrown on three of the endpoints (cpc_subsections, nber_subcategories, uspc_mainclasses) when matched_subentities_only is false and all possible fields are requested.

library(patentsview)

# Request every field from each endpoint and report which endpoints fail.
z <- lapply(get_endpoints(), function(x) {
  tryCatch(
    {
      search_pv(
        '{"patent_number":"5116621"}',
        endpoint = x,
        mtchd_subent_only = FALSE,
        fields = get_fields(x)
      )
      print(paste("success", x))
    },
    error = function(e) print(paste("error", x, e))
  )
})
[1] "success assignees"
[1] "error cpc_subsections Error in xheader_er_or_status(resp): Internal Server Error (HTTP 500).\n"
[1] "success inventors"
[1] "success locations"
[1] "error nber_subcategories Error in xheader_er_or_status(resp): Internal Server Error (HTTP 500).\n"
[1] "success patents"
[1] "error uspc_mainclasses Error in xheader_er_or_status(resp): Internal Server Error (HTTP 500).\n"

If I set mtchd_subent_only=TRUE then only the locations endpoint throws a 500 as we saw before.

@sarahkelley

Thanks for pointing this out -- this is definitely an issue and we are looking into it! We'll get back to you once we've fixed the issue!

@sarahkelley

I've opened issue #39 to describe the problem (that the API in these cases is trying to return more information than it is able to handle) and a few workarounds you can use for now. I'm going to close this more specific issue in favor of the more general issue #39.
