PTHR24031 sum3 ! chromosome segregation PAINT_REF:24031 #1322

Closed
ValWood opened this issue Sep 24, 2015 · 29 comments

@ValWood
Contributor

ValWood commented Sep 24, 2015

sum3 is a translation initiation factor. Roles in translation (or possibly splicing?)

chromosome segregation is way too indirect:

PomBase SPCC1795.11 sum3 GO:0007059 PAINT_REF:24031 IEA PANTHER:PTN000619593 P translation initiation RNA helicase Sum3 moc2|slh3|ded1 protein taxon:4896 20150318 GO_Central

@pgaudet pgaudet changed the title from "sum3 ! chromosome segregation PAINT_REF:24031" to "PTHR24031 sum3 ! chromosome segregation PAINT_REF:24031" on Oct 19, 2016
@pgaudet
Contributor

pgaudet commented Oct 19, 2016

Very large family: 2800 members. Postpone this fix.

@pgaudet
Contributor

pgaudet commented Oct 20, 2016

Removed annotations manually.

@pgaudet pgaudet closed this as completed Oct 20, 2016
@selewis

selewis commented Oct 25, 2016

I thought you were now able to open the large families. Is this still an
issue for you?

@pgaudet
Contributor

pgaudet commented Oct 25, 2016

Yes, it's still a problem. The largest families Marc and I have been able to open since we
switched to PAINT 2 have about 600 members.

@krchristie

krchristie commented Oct 26, 2016

Opening large families doesn't seem to be a problem for me.

Earlier this summer, on 5/6/16, I opened the family below with over 1500 sequences in order to make some corrections brought to our attention by an MGI user:

PTHR24249 (1642 sequences)

  • HISTAMINE RECEPTOR-RELATED G-PROTEIN COUPLED RECEPTOR

Based on file creation dates, I think I was probably using beta 2.18 (I save new versions of PAINT into new directories rather than overwriting and it's been a while since I deleted an old version.)

I have just opened it again (using an instance of PAINT 2.25 that had previously opened another family) and it didn't take that long to open: 30 minutes at most, and probably less. I wasn't paying much attention to how long it took, since I left it in the background while I did other stuff.

-Karen

Note: I was not at home and based on ping times of google.com, my internet connection was not as fast or consistent as I get at home.

@krchristie

I can also open PTHR24031. This time it was the first family I loaded into a newly launched instance of PAINT and I was at home with a fast, reliable connection. It took 7-8 minutes to load.

@cmungall
Member

Ouch, 30 minutes!

OK, we should be able to massively reduce this.

@selewis, does the current release version of PAINT have sufficient
logging to determine the duration of each remote call? If so, Karen can
paste the logging output in this ticket and we can take it from there.

@krchristie

krchristie commented Oct 26, 2016

I wasn't saying that this took 30 minutes to load, only that it loaded within 30 minutes: I took no time points between starting it and checking 30 minutes later, by which point it had loaded. That contrasts with Pascale's report that she can't load large families at all. I selected this family because I knew I had opened it previously with a version of PAINT 2.

For one I actually timed (at home on my normal, good connection), see my other comment about loading PTHR24031, which is huge with over 2800 sequences. That one took 7-8 minutes to load and was the first family loaded into a newly opened instance of PAINT, which always takes longer.

For me personally, it's not a priority to make the loading faster. The loading speed is not a show stopper for me, in contrast to some other things which cause me to lose work (geneontology/paint#15). I know that the first load is always much slower than subsequent loads of other families, so on days I plan to do PAINT, I start one loading early while I'm still reading my email. I also generally run two instances of PAINT so that when it's time to load a new family into one instance, I switch to the second one which will have finished loading in the time I was working on something in the first one.

I posted into this thread to provide additional info to try to help narrow down where the problem is occurring for Pascale and Marc.

@marcfeuermann

I have been trying to open 24031 for more than 1h30. Still nothing. The data are getting lost somewhere in the Atlantic Ocean!!
It works fine for medium-sized families, so production is not jeopardized, but we cannot annotate very big families and, more importantly, we cannot correct or improve their annotation.

@pgaudet
Contributor

pgaudet commented Oct 27, 2016

Hi Chris,

If there is anything you can do, that would make me very happy. I managed to annotate about 10 small families this week, but I need to try to open each family about 10 times before it opens.

The last message on the terminal is always:
2016-10-27 15:02:55,836 INFO (LoggingStageExecutor:103) Class Taxonomy Computation took 24382 ms
IOException Unexpected end of file from server has been returned while sending and receiving information from server

(the duration varies slightly each time I try)

What is this? Could it be the taxon constraints? I don't think it's a memory issue on my end: increasing the memory allocation in Java when I launch PAINT (with the -Xmx option) does not help at all.

Thanks,
Pascale

@cmungall
Member

@krchristie - sounds like you have a workaround (though this is quite involved!)

@pgaudet and @marcfeuermann thanks for the data points and the log messages. I have asked for some more detailed logging (see #24). Unfortunately what we have now is not much to go on, but I'm going to make some guesses.

'Class Taxonomy Computation took 24382 ms'

This is just the output from the Elk reasoner, which is called on ontology loading. It's normally not informative. However, I expect this to be done in <1s. The fact that it took 24s leads me to speculate that there is something not right in your memory settings.

I don't see anything on https://github.com/geneontology/paint about memory settings or minimum memory requirements, so I'm not sure what your protocol is on assigning more memory, I will try and find out more.

IOException Unexpected end of file from server has been returned while sending and receiving information from server

This is frustratingly almost informative! If only we knew what it was attempting to retrieve (it's nothing to do with the line above, it just happens to follow it). It can't be the ontology, as it must have that for the reasoning step. It could be either family data, or it could be a golr query... until we do #24 it's hard to tell!

@cmungall
Member

@pgaudet Oh I just realized you said you set the memory using -Xmx, in which case my hypothesis is likely false. Can you state exactly how you start up paint? The instructions say to use the .sh file - do you edit this in place?

@cmungall
Member

OK, I spoke to @selewis and she told me that "IOException Unexpected end of file from server has been returned while sending and receiving information from server" is coming from the PantherService. I checked the code and this is definitely the case. Progress, we have a firm diagnosis!

We'll also add something to make the logging more informative, and to give more information on how long the service call was running.

It seems the issue is worst for you, @pgaudet. I'm wondering if this is some network or firewall peculiarity. Do you have VPN software you can use for some tests?

@pgaudet
Contributor

pgaudet commented Oct 27, 2016

Hi Chris,

I use the command line to do something like Java - jar Xmx####
paint-all.jar (this is from memory; I can check the exact command tomorrow
if you need it.)

I started with Xmx2048 (as far as I recall) and went up to 16384 (doubling
each time), with no impact on load performance.

I hope this gives you enough information to dig a little further.

Thanks, Pascale

@pgaudet
Contributor

pgaudet commented Oct 27, 2016

Hi Chris, What VPN software should I use for what tests?

Thanks, Pascale

@cmungall
Member

Emailed you a few suggestions.

also - any correlation between family size and time taken? Can you give a few example family IDs just so we can be sure we're comparing like with like?

@pgaudet
Contributor

pgaudet commented Oct 28, 2016

Hi Chris,

Several points:

  1. I think family sizes matter; I cannot do anything beyond a few hundred
    on good days. Here are examples of families I managed to do yesterday:
    PTHR12730 72 members
    PTHR31508 23 members
    PTHR43556 28 members
    PTHR24324 45 members
    PTHR21373 73 members
    PTHR38030 16 members
    PTHR16091 36 members

With maybe one exception, I had to try loading the family multiple times
before it worked. There don't seem to be 'bad' families; sometimes after
a certain number of attempts it loads. But more often than not I get this IO
Exception (probably 9 times out of 10).

  2. Marc seems to have far fewer problems than I do. He is on a PC with 16G
    of RAM. I have a Mac with 4G and a Mac with 8G (the last time I compared with
    Suzi she also had 8G). Perhaps 8G is just not enough?
  3. I agree that the error "Class Taxonomy Computation took 20263 ms
    IOException Unexpected end of file from server has been returned while
    sending and receiving information from server" is not very informative, but
    it seems to me that what crashes is whatever happens *after* this class
    taxonomy computation, because although this step is slow, it does happen.
    Do you (or Suzi) know what the next step is?
  4. I am happy to keep testing anything on my end but I really think there
    is something odd on one of the servers. For example, the middle of the week
    is always better for us. On Monday mornings PAINT is almost always
    inaccessible, and on Fridays we also often have performance issues.

Thanks, Pascale

@pgaudet
Contributor

pgaudet commented Oct 28, 2016

Another thing:

  1. (I have now managed to load a family after 3 attempts): the step after the class
    computation, according to the terminal window output, is
    "INFO (IDmap:113) Identical Seq ID (UniProtKB:A8WVP2) for 2 different
    genes: Gene:CBG03894 Gene:CBG_03894"
    ... could this be the step that times out?

Thanks

Pascale

@pgaudet
Contributor

pgaudet commented Oct 28, 2016

Hello again,

I seem to have found a way to get PAINT to work:
java -jar paint-all.jar -Xmx33680

This is much more memory than I thought I needed! I also tried doubling it, but I am not sure that made a difference. I still cannot load 1000-member families; for example, PTHR23050 has crashed twice already.

Pascale

@pgaudet
Contributor

pgaudet commented Oct 28, 2016

Last update for today: I was very happy to be able to annotate a few
families today. However at around 4 PM my time I started having problems
again opening families. I tested the connection again with similar results
as this morning.

Thanks,
Pascale

@krchristie

Hey Pascale,

Do you ever check the speed/consistency of your connection? When I have issues, I often check my connection using the ping command in a terminal window. I usually ping google, but have also pinged yahoo. From where I am, I usually get 20ms - 200ms with either, and slightly faster times with google, but only by a little.

ping google.com

If I try to use PAINT somewhere where my connection times get really slow (over 1000ms for sure, though high hundreds can be problematic too) or erratic (some pings with very long response times or no response at all), it just stalls. I've had this happen when I had been doing PAINT from home, and then needed to work in a coffee shop for a couple hours while one of my kids was at an appointment, and the coffee shop wifi connection was really terrible and I couldn't load anything in PAINT at all. I don't think PAINT had gone down that day either.

-Karen

@cmungall
Member

@pgaudet wrote: I agree that the error "Class Taxonomy Computation took 20263 ms IOException Unexpected end of file from server has been returned while sending and receiving information from server" is not very informative, but it seems to me that what crashes is whatever happens *after* this class taxonomy computation, because although this step is slow, it does happen. Do you (or Suzi) know what the next step is?

Note these are two separate steps.

  1. "Class Taxonomy Computation took 20263" is the output from the Elk reasoner after loading the ontology. See my comments above. While this has nothing to do with your timeouts, the fact that this step takes so long is a clue that something is up with your memory.
  2. After this is complete (I'm not sure if immediately after or there is another step) the panther service is called. If this does not succeed then it writes "IOException Unexpected end of file from server". I have asked @selewis for additional logging that shows exactly how long this step takes. According to @mugitty's tests, this should take no more than 5 minutes (locally). Sorry this is tedious, but can you tell us how long it takes between writing "Class Taxonomy Computation" and "IOException"?
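
Until the extra logging exists, one rough way to measure that gap is to timestamp every line the launcher prints as it arrives. The following is a minimal sketch, assuming a POSIX shell; `stamp_lines` is a hypothetical helper and not part of PAINT, and the two echoed messages are stand-ins for the real log output.

```shell
#!/bin/sh
# Hypothetical helper (not part of PAINT): prefix each line read from
# standard input with the wall-clock time at which it arrived.
stamp_lines() {
    while IFS= read -r line; do
        printf '%s %s\n' "$(date '+%H:%M:%S')" "$line"
    done
}

# Self-contained demonstration using two stand-in messages one second apart.
{
    echo "Class Taxonomy Computation took 20263 ms"
    sleep 1
    echo "IOException Unexpected end of file from server"
} | stamp_lines
```

With PAINT itself, the same function could sit at the end of the launch line, for example `./launchPAINT.sh 2>&1 | stamp_lines`; the difference between the two stamps then approximates how long the service call ran before failing.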

@cmungall
Member

@krchristie - this is helpful, thanks

I had @pgaudet do a speedtest and her connection seems fine.

It seems most likely that the machine is underpowered: it's operating near its limits when it makes the panther service call, which will increase the probability of a timeout on a 5 minute transatlantic API call.

@cmungall
Member

@krchristie and @marcfeuermann - how much memory do you both allocate (and how much is available on your machine - I know it's 16G for you Marc)? Do you use the startup script that is available from the releases on github?

@pgaudet can you let us know whether you are using your 4G machine or your 8G Machine each time you provide a data point?

Also, are you sure this is the command you use?

java -jar paint-all.jar -Xmx33680

Note that -Xmx should precede the -jar argument, otherwise it is either ignored or treated as program-specific parameters. I believe that the above command is equivalent to

java -jar paint-all.jar
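
To make the ordering rule concrete, here is a minimal shell sketch that checks a launch line without starting Java. The function name `xmx_before_jar` is hypothetical and not part of PAINT, and the `4g` heap size is only an illustrative assumption, not a recommended setting.

```shell
#!/bin/sh
# Hypothetical helper (not part of PAINT): succeeds when an -Xmx option
# appears before -jar in the given command line, i.e. where the JVM reads it.
xmx_before_jar() {
    for arg in "$@"; do
        case "$arg" in
            -Xmx*) return 0 ;;   # heap option seen first: the JVM will read it
            -jar)  return 1 ;;   # -jar seen first: later arguments go to the program
        esac
    done
    return 1                     # no -Xmx found before -jar (or no -jar at all)
}

# Example checks; neither line actually launches Java.
xmx_before_jar java -Xmx4g -jar paint-all.jar \
    && echo "heap option is read by the JVM"
xmx_before_jar java -jar paint-all.jar -Xmx16384m \
    || echo "heap option is passed to the program instead"
```

Everything after the jar file name is handed to the program as ordinary arguments, which is why the second form leaves the heap at the JVM default. Note also that `-Xmx` takes a size in bytes unless a suffix such as `m` or `g` is appended, so a bare number like `33680` placed before `-jar` would request a heap of only a few kilobytes.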

@krchristie

@cmungall

I have 16 G memory.

I have never done anything to change the memory allocation, so I have no idea. I no longer remember how to even check this.

I generally run two copies of PAINT at the same time (and sometimes other Java apps, e.g. OBO-Edit too)

I launch from the command line in a terminal window with

./launchPAINT.sh

@cmungall
Member

OK, this is interesting. launchPAINT.sh from 2.24 has this:

java -jar paint-all.jar -Xmx16384m

The exact behavior of this may be dependent on system, but I have never seen the memory specified after the jar before. On my mac this has the effect of ignoring the memory setting, and default memory will be used. I don't know if this is the same on your mac. It may be setting the memory to 16G. Or it may be ignoring it and using whatever your system default is for java.

@cmungall
Member

@marcfeuermann

I have been trying to open 24031 for more than 1h30. Still nothing. The data are getting lost somewhere in the Atlantic Ocean!!

So it sounds like it's hanging. Can you show what is at the tail of your log when you do this?

@krchristie

krchristie commented Oct 31, 2016

I am currently using 2.25, and my version of launchPAINT.sh (I usually just copy from previous PAINT directories) has this:

java -jar paint-all.jar -Xmx1600m

@selewis

selewis commented Nov 15, 2016

@pgaudet and @marcfeuermann
Try the scripts with v2.27 and see how that goes. You can use your existing files (though these may be older panther trees) to relieve some of the download time.

@selewis selewis reopened this Nov 15, 2016
@selewis selewis closed this as completed Nov 15, 2016