blob at coordinates 0,0,0 #1

Open
pbellec opened this issue Oct 20, 2014 · 9 comments


pbellec commented Oct 20, 2014

When running queries such as "Alzheimer" (~280 papers) or "disease" (~780 papers) there is one big blob that seems to be centered at coordinates 0,0,0. It is particularly clear with the "disease" query. It looks like a bug to me, either in the data or the software.

Author

pbellec commented Oct 20, 2014

The "MCI" query has a similar problem.

Owner

r03ert0 commented Oct 20, 2014

Hi Pierre,

I found the problem. It's not really a bug in brainspell, but an issue
with neurosynth's parser. Tables are often interpreted as stereotaxic
coordinates when they in fact contain something else. I dumped all
the coordinates for the papers that match search?q=alzheimer, and
there are many like this one, for example:

http://brainspell.dev/article/22169204

which have coordinates like these:

15.0 65.0 20.0
8.0 40.0 40.0
6.4 41.0 37.0
4.7 28.0 45.0
3.7 24.0 44.0
1.0 2.0 3.0
1.0 2.0 3.0
1.0 2.0 3.0
1.0 2.0 3.0
1.0 2.0 3.0
1.0 2.0 3.0
1.0 2.0 3.0
1.0 2.0 3.0
1.0 2.0 3.0
1.0 2.0 3.0
1.0 2.0 3.0
1.0 2.0 3.0
1.0 2.0 3.0
1.0 2.0 3.0
1.0 2.0 3.0
1.0 2.0 3.0
1.0 2.0 3.0
1.0 2.0 3.0
1.0 2.0 3.0
1.0 2.0 3.0
1.0 2.0 3.0
1.0 2.0 3.0
1.0 2.0 3.0
1.0 2.0 3.0
1.0 2.0 3.0

...of course, that makes a pile of rubbish at a coordinate close to
0,0,0, and that's not the only paper I found (e.g.,
http://brainspell.dev/article/19703569,
http://brainspell.dev/article/18805495).
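
A table like the one above could in principle be flagged automatically before anyone curates it by hand. The following is only a rough sketch of such a heuristic, not something brainspell or neurosynth is known to do: the function name `table_looks_suspicious`, both thresholds, and the "no negative coordinates" rule are assumptions made for illustration.

```python
from collections import Counter

# Hypothetical thresholds, chosen for illustration only.
MAX_DUPLICATE_FRACTION = 0.5  # flag if one row makes up more than half the table
MAX_ABS_MM = 120.0            # rough bound on a plausible coordinate, in mm


def table_looks_suspicious(rows):
    """Return a list of reasons why a table may not hold real coordinates.

    `rows` is a list of (x, y, z) tuples as extracted by the parser.
    An empty list means no rule fired, not that the table is correct.
    """
    reasons = []
    if not rows:
        return reasons

    # Rule 1: one row repeated many times (e.g. "1.0 2.0 3.0" 25 times).
    most_common_row, count = Counter(rows).most_common(1)[0]
    if len(rows) >= 4 and count / len(rows) > MAX_DUPLICATE_FRACTION:
        reasons.append(f"row {most_common_row} repeated {count} times")

    # Rule 2: a value too large to be a coordinate inside the brain.
    if any(abs(v) > MAX_ABS_MM for row in rows for v in row):
        reasons.append("value outside plausible range")

    # Rule 3 (assumption): a table of several foci with no negative value
    # at all is unusual, since negative x, y or z values are common.
    if len(rows) >= 4 and all(v >= 0 for row in rows for v in row):
        reasons.append("no negative coordinates")

    return reasons
```

Applied to the 30 rows listed above, rules 1 and 3 fire. A flagged table would still need a human to confirm it, so this would only narrow down what to look at during manual curation.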

The only solution I see for the moment is manual curation... to go and
tag all those tables as incorrect. I'm also working on implementing a
way of manually editing/correcting the tables. That could also help.

Finally, the larger the list of articles matching a query, the
more likely it is that you'll get wrong tables in the mix...
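
To put a rough number on that last point: if each paper independently had the same chance of carrying a misparsed table, the chance of at least one bad table in a result set would grow quickly with its size. The sketch below assumes independence and uses a made-up per-paper error rate (`P_BAD`); the real rate is not known from this thread. The paper counts are the approximate ones reported for the two queries.

```python
def prob_at_least_one_bad_table(n_papers, p_bad):
    """Chance that at least one of n_papers has a misparsed table,
    assuming each paper is bad independently with probability p_bad."""
    return 1.0 - (1.0 - p_bad) ** n_papers


P_BAD = 0.01  # hypothetical per-paper error rate, not a measured value

for query, n_papers in [("Alzheimer", 280), ("disease", 780)]:
    print(query, round(prob_at_least_one_bad_table(n_papers, P_BAD), 3))
```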

hope this helps.

best,
roberto


Author

pbellec commented Oct 20, 2014

I see. I didn't realize it wasn't possible to manually edit the table. That would most definitely be useful. And we should let Tal know about this. As I told you, we are planning a brainspell Alzheimer/MCI tag sprint here in Montreal, and we could work on fixing at least these papers.

Owner

r03ert0 commented Oct 20, 2014

The problem with manually editing the tables is how to deal with multiple
edits (the worst case being vandalism). I think that as a first approach
I'll just assume that people are good :)


Author

pbellec commented Oct 20, 2014

Indeed. Also, it is unclear to me how you are going to keep the database
updated. Will you merge with new releases from neurosynth? If so, how
will you deal with conflicts?



Owner

r03ert0 commented Oct 21, 2014

  • the DB is manually updated when more neurosynth data is made available
    (for the most recent papers, pubmed metadata is sometimes unavailable, so
    the DB needs frequent updates in any case)
  • ideally, neurosynth should be just one more user, and its updates
    considered as such (for many values, all votes/tags from all users are
    kept). But for manually entered data, I think that human input should
    be given precedence over algorithmic input (such as neurosynth), while
    keeping the 'flagging' mechanism to report errors. Finally, for humans
    overwriting humans or algorithms overwriting algorithms, I would apply the
    same principle as before, and assume that the last editor was correcting
    the previous one (i.e., that users are good).
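
The precedence rule in the second point can be sketched in a few lines. This is only an illustration of the policy as described, not brainspell's actual code or data model: the `Edit` class, its fields, and the `resolve` function are hypothetical names, and the flagging mechanism is left out because the thread does not say how flags would affect the outcome.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edit:
    editor: str      # user name, or "neurosynth" for the automated import
    is_human: bool   # False for algorithmic input
    timestamp: int   # any increasing value, e.g. Unix time
    value: str       # the edited content (a table, a tag, ...)


def resolve(edits):
    """Pick the edit to display from the full history.

    Human input beats algorithmic input; within the same kind of
    editor, the most recent edit wins. Returns None for no edits.
    """
    if not edits:
        return None
    # Tuples compare element by element, and True > False, so any
    # human edit outranks every algorithmic one before time is compared.
    return max(edits, key=lambda e: (e.is_human, e.timestamp))


history = [
    Edit("alice", True, 1, "table v1"),
    Edit("bob", True, 2, "table v2"),
    Edit("neurosynth", False, 3, "table from new release"),
]
print(resolve(history).value)  # the later neurosynth import does not override bob
```

Keeping the whole history rather than overwriting in place means an earlier edit can be restored if the "users are good" assumption turns out to be wrong for a particular change.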


Author

pbellec commented Oct 21, 2014

At this stage I'd indeed be surprised if you ran into trouble by assuming
"users are good". You could always add some moderation mechanism down the
road.



Owner

r03ert0 commented Aug 1, 2015

Is it OK to close this issue?

@amanbadhwar
Collaborator

With regard to this issue: I am currently adding ~30 articles to brainspell, so I will echo the sentiment that my 'human' input should be given precedence over an algorithm (at least for these articles).

Cheers,
Aman
