Refine said that the package was ingested, but it did not include all of the titles #492

Closed
jhsolomon opened this issue Apr 7, 2016 · 10 comments

Comments

@jhsolomon
Collaborator

In Karger: Journals Collection: Test 1 and Karger: Journals Collection: Test 2 there are 150 titles. Only 100 were ingested, but Refine did not flag this as a partial ingest.

@jhsolomon jhsolomon added the bug label Apr 7, 2016
@jhsolomon jhsolomon added this to the 6.0 milestone Apr 7, 2016
@ianibo
Member

ianibo commented Apr 7, 2016

Looking in the log, the server thinks there were only 100 lines in the file, so it processed everything sent. If I can get the source file I'll dig.

@ianibo
Member

ianibo commented Apr 7, 2016

Heya - when I open that file in refine, it only shows 100 titles, although there are 150 in the file - can you confirm you do/don't see something different? cheers,
e

@ianibo
Member

ianibo commented Apr 7, 2016

Ignore me :)

@jhsolomon
Collaborator Author

Yes, I see 150.
[image: Inline image 1]

Jennifer Solomon
GOKb Editor, Acquisitions and Discovery
North Carolina State University Libraries
919-515-2743

@ianibo
Member

ianibo commented Apr 7, 2016

Ok - sensible debugging this time :) can you take a look at line 160 of that file and see if you have a line like

3-0283 full text Ceased publication S. Karger AG

In the middle of the file?

@ianibo
Member

ianibo commented Apr 7, 2016

Seeing this in the Refine conversation, logged on the client:

19:13:24.825 [ command] Exception caught (41ms)
java.io.IOException: Cannot retrieve content from https://test-gokb.kuali.org/gokb/api/projectIngestProgress?projectID=323246&_=1460052804781
at com.k_int.gokb.refine.A_RefineAPIBridge.toAPI(A_RefineAPIBridge.java:191)
at com.k_int.gokb.refine.A_RefineAPIBridge.getFromAPI(A_RefineAPIBridge.java:91)
at com.k_int.gokb.refine.commands.GerericProxiedCommand.doGet(GerericProxiedCommand.java:33)
at com.google.refine.RefineServlet.service(RefineServlet.java:170)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1166)
at org.mortbay.servlet.UserAgentFilter.doFilter(UserAgentFilter.java:81)
at org.mortbay.servlet.GzipFilter.doFilter(GzipFilter.java:132)
at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1157)
at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:388)
at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:765)
at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:418)
at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
at org.mortbay.jetty.Server.handle(Server.java:326)
at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:923)
at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:547)
at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:228)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: Server returned HTTP response code: 504 for URL: https://test-gokb.kuali.org/gokb/api/projectIngestProgress?projectID=323246&_=1460052804781
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1627)
at sun.net.www.protocol.https.HttpsURLConnectionImpl.getInputStream(HttpsURLConnectionImpl.java:254)
at com.k_int.gokb.refine.A_RefineAPIBridge.toAPI(A_RefineAPIBridge.java:176)
... 25 more
19:13:29.785 [ refine] GET /command/gokb/projectIngestProgress (4960ms)

@ianibo
Member

ianibo commented Apr 7, 2016

Righty :)

At some point we upped the batch size from 25 to 100 rows inside the cred. It looks like we forgot to update a bit of logic that adjusts for a row count that falls exactly on a batch boundary. Because 150 is an exact multiple of 25, we were hitting the error case. I've updated it and I think the Karger package is ingesting now. J - could you give it another whirl please?
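
For anyone reading later, here is a rough sketch of how a stale boundary check could produce this symptom. It is not the actual GOKb code: the class name, method names, and the exact form of the check are all assumptions made up for illustration. The only values taken from this thread are the old batch size (25), the new one (100), and the 150-row file.

```java
public class BatchBoundarySketch {
    // Batch sizes mentioned in this thread.
    static final int BATCH_SIZE = 100;      // current batch size
    static final int STALE_BATCH_SIZE = 25; // old size, assumed to linger in one check

    // Hypothetical buggy version: the "exactly on the boundary" test still
    // uses the old batch size, so 150 rows are treated as needing no
    // partial final batch.
    public static int rowsSentBuggy(int totalRows) {
        int batches = (totalRows % STALE_BATCH_SIZE == 0)
                ? totalRows / BATCH_SIZE       // assumes no remainder to send
                : totalRows / BATCH_SIZE + 1;  // adds a final partial batch
        return Math.min(totalRows, batches * BATCH_SIZE);
    }

    // One way to avoid the special case altogether: ceiling division
    // against a single batch-size constant.
    public static int rowsSentFixed(int totalRows) {
        int batches = (totalRows + BATCH_SIZE - 1) / BATCH_SIZE;
        return Math.min(totalRows, batches * BATCH_SIZE);
    }

    public static void main(String[] args) {
        System.out.println(rowsSentBuggy(150)); // prints 100
        System.out.println(rowsSentFixed(150)); // prints 150
    }
}
```

In this sketch a 130-row file would go through either version intact, which may be why a file of exactly 150 rows was needed to expose the problem.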

@ianibo
Member

ianibo commented Apr 7, 2016

P.S. I've temporarily commented out the code that records the user updating a package via Refine; I'll put it back in once we're sure this is fixed.

@jhsolomon
Collaborator Author

Cool. This time all of the titles were ingested.

@ianibo ianibo added the TestMe label Apr 19, 2016
@jhsolomon
Collaborator Author

fix confirmed
