Ontology of departments and advisor/student links #42

Closed
RichardLitt opened this issue Dec 12, 2014 · 5 comments

Comments

@RichardLitt
Member

Idea: We really need a scholarly graph or ontology of people. Can we scrape university websites to get this?

An example would be Erdős numbers, which measure how many co-authorship steps separate a person from Erdős.

This is an open problem in science, and it would be incredibly useful if we could capture it. One way of doing this would be to mine dissertations and find advisors.
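
For illustration, here is a minimal sketch of how an Erdős-style distance could be computed once some collaboration data exists. It assumes the data is simply a list of author lists, one per paper; the function name `collaboration_distances` and the placeholder author names are hypothetical and not part of any existing tool mentioned in this thread.

```python
from collections import deque


def collaboration_distances(papers, source):
    """Return the shortest co-authorship distance from `source` to each reachable author."""
    # Build an undirected graph: two people are linked if they share a paper.
    graph = {}
    for authors in papers:
        for author in authors:
            graph.setdefault(author, set()).update(a for a in authors if a != author)

    # Breadth-first search gives the minimum number of co-authorship steps.
    distances = {source: 0}
    queue = deque([source])
    while queue:
        person = queue.popleft()
        for coauthor in graph.get(person, ()):
            if coauthor not in distances:
                distances[coauthor] = distances[person] + 1
                queue.append(coauthor)
    return distances


# Hypothetical author lists, one per paper (placeholder names, not real data).
papers = [
    ["Erdős", "A"],
    ["A", "B"],
    ["B", "C"],
    ["D", "E"],
]
distances = collaboration_distances(papers, "Erdős")
```

Authors with no chain of collaborators leading back to the source (here `D` and `E`) are simply absent from the result. The same search would work on advisor/student links if those were used as edges instead of shared papers.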

@jameswweis
Member

Scraping lab websites could be tricky -- formats differ widely. Mining
dissertations is one approach -- another is inferring the relationships
from journal citation data. This may lead to more of an 'interaction
community,' but that may be more useful to us anyway.


@RichardLitt
Member Author

Yeah, scraping websites isn't necessarily feasible at scale. Inferring relationships would work better if we could get a training set, though, which we might be able to build through mining.

@adammarblestone-zz

There are a few relevant projects among our "class" of MetaKnowledge grants:

http://www.knowledgelab.org/news/detail/1.4_million_in_grants_awarded_to_metaknowledge_projects

Stephen David: "Neurotree: Graphing the Evolution of Science Through
Mentorship Networks". Oregon Health and Science University

Neuroscience professor Stephen David runs Neurotree, an open-access website
that has tracked mentor relationships for over 40,000 neuroscientists over
the last eight years. David plans to use his grant to develop tools to
curate Neurotree’s database and link it to publication databases. These
links will help “explore how mentorship influences the emergence and
evolution of ideas, and if this information can help trainees choose
mentors,” writes David. The grant will also support the development of the
growing Academic Family Tree, which does work similar to Neurotree for
other disciplines like music composition and theology.

Because of the magnitude and difficulty of problems like name
disambiguation in this area, my feeling within Beagle would be that we
should catalyze others to solve this problem, while we focus at least
initially on the core user interface issues for scientific annotation and
sharing... for example, at the MetaKnowledge workshop I met someone, in
addition to Stephen David, who was working on this
scrape-the-web-for-scientists'-identities-and-relationships problem...

But I agree that at the "platform" level of scientific sharing tools,
something like this would be great to have down the line.


@jameswweis
Member

Interesting. Also potentially useful is the Mathematics Genealogy Project: http://genealogy.math.ndsu.nodak.edu/


@RichardLitt
Member Author

I agree about helping to catalyze. Alright! We should contact Stephen David to see if Neurotree has an API. Same for the Mathematics Genealogy Project.

Closing this for now.
