A Neural QA Model for DBpedia (GSoC 2019) #19

Open
mommi84 opened this Issue Dec 15, 2018 · 19 comments

@mommi84
Member

commented Dec 15, 2018

Previous projects

This project idea is a follow-up to the GSoC 2018 project A Neural QA Model for DBpedia.

Description

In recent years, the Linked Data Cloud has grown to over 100 billion facts pertaining to a multitude of domains. The DBpedia knowledge base consists of 4.58 million things on its own. However, accessing this information is challenging for lay users, as they are not able to use SPARQL as a query language without exhaustive training.

Recently, neural network architectures known as seq2seq have been shown to achieve state-of-the-art results at translating sequences into sequences. In this direction, we suggest a GSoC topic around neural networks that translate any natural language expression into sentences encoding SPARQL queries. Our preliminary work on Question Answering with Neural SPARQL Machines (NSpM) shows promising results, but the coverage is restricted to manually curated templates.
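To make "sentences encoding SPARQL queries" concrete, here is a minimal sketch of how a query could be flattened into a plain token sequence that a seq2seq model can emit, and then turned back into a query. The token names (`var_`, `brack_open`, `brack_close`, `dbo_`, `dbr_`) are assumptions chosen for illustration; the encoding actually used in the NSpM repository may differ.

```python
# Hypothetical symbol-to-token mapping; not necessarily the NSpM encoding.
REPLACEMENTS = [
    ("dbo:", "dbo_"),
    ("dbr:", "dbr_"),
    ("?", "var_"),
    ("{", " brack_open "),
    ("}", " brack_close "),
]


def encode_sparql(query):
    """Flatten a SPARQL query into a whitespace-separated token sequence."""
    for old, new in REPLACEMENTS:
        query = query.replace(old, new)
    # Collapse the extra spaces introduced around the bracket tokens.
    return " ".join(query.split())


def decode_sparql(sequence):
    """Invert encode_sparql for queries that use only the symbols above."""
    decoded = []
    for token in sequence.split():
        if token == "brack_open":
            decoded.append("{")
        elif token == "brack_close":
            decoded.append("}")
        elif token.startswith("var_"):
            decoded.append("?" + token[len("var_"):])
        elif token.startswith(("dbo_", "dbr_")):
            # Restore the prefix separator: dbr_Rome becomes dbr:Rome.
            decoded.append(token[:3] + ":" + token[4:])
        else:
            decoded.append(token)
    return " ".join(decoded)


if __name__ == "__main__":
    query = "select ?x where { dbr:Rome dbo:country ?x }"
    print(encode_sparql(query))
    print(decode_sparql(encode_sparql(query)))
```

The round trip only works for queries built from the handful of symbols listed in `REPLACEMENTS`; a real encoding would need to cover the rest of the SPARQL syntax as well.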

The most up-to-date source code can be found here. During the GSoC, we will use this repository as the workspace.

Goals

In this GSoC project, the candidate can choose between the following research directions:

  1. employ a language model (e.g., Question Generation, Universal Sentence Encoders) to automatically discover query templates;
  2. perform experiments on compositionality for complex QA;

with the following ultimate goals:

  • train one or more NSpM models on DBpedia;
  • evaluate the model against either the QALD benchmark (direction 1) or a new task-oriented dataset (direction 2).

Impact

The project will allow users to access DBpedia knowledge using natural language.

Warm-up tasks

Mentors

Rricha Jalota and Nausheen Fatma (backup: Aashay Singhal, Aman Mehta, Tommaso Soru).

Keywords

structured question answering, deep learning, neural networks, sparql, tensorflow, python

@wannabeOG

commented Jan 10, 2019

Great idea. I have opened some issues that I found while running the initial tests. I would like to work on this project.

@mommi84

Member Author

commented Feb 13, 2019

Hi @wannabeOG! Thanks for your interest. Please open a pull request if you think you can fix those issues. Have you thought about which research direction you would like to explore?

@yudhik11

commented Feb 18, 2019

Hi,
I have gone through both of the papers mentioned above, which gave me good insight.
Also, after going through the blog, I was able to reproduce the experiment with --num_train_steps=12000, achieving a dev BLEU of 88.1 and a test BLEU of 87.3.

Currently, I am going through the code more extensively, but it would be helpful to get some suggestions on how to proceed further in this project.

@amanmehta-maniac

Contributor

commented Feb 21, 2019

Hi @yudhik11, that's a good start. I suggest you go ahead with warm-up task number 2, that is, downloading and editing a sample template and training a Neural SPARQL Machine model. Please go through the mentioned wiki while you do so.
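For anyone unsure what "editing a sample template" leads to, the sketch below shows the general idea of turning one template into question/query training pairs. The `<A>` placeholder, the template strings, and the hand-written entity list are all assumptions for illustration; the template file format and generator in the repository may look different, and the entities there come from DBpedia rather than a hard-coded list.

```python
PLACEHOLDER = "<A>"  # hypothetical placeholder marker


def instantiate(question_template, query_template, entities):
    """Fill one template pair with each (label, resource) entity."""
    pairs = []
    for label, resource in entities:
        question = question_template.replace(PLACEHOLDER, label)
        query = query_template.replace(PLACEHOLDER, resource)
        pairs.append((question, query))
    return pairs


if __name__ == "__main__":
    # Illustrative template and entities, written by hand for this sketch.
    examples = instantiate(
        "where is <A> located?",
        "select ?x where { <A> dbo:location ?x }",
        [
            ("Eiffel Tower", "dbr:Eiffel_Tower"),
            ("Colosseum", "dbr:Colosseum"),
        ],
    )
    for question, query in examples:
        print(question, "=>", query)
```

Each resulting pair is one training example: the question is the source sequence and the query is the target sequence.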

@yudhik11

commented Feb 27, 2019

I am stuck at one of the steps while following the PIPELINE.
Let's say I want to go with dbo:Continent. I tried extracting its properties, as described in the pipeline, from the linked ontology and page, but was not successful.

Can anyone guide me through this?

@amanmehta-maniac

Contributor

commented Mar 4, 2019

The URL (for the Place class) is http://mappings.dbpedia.org/server/ontology/classes/Place. Are you done with warm-up task 2, though? I'd suggest you go through the warm-up tasks in the mentioned order. You do not need PIPELINE to be able to complete warm-up task 2.

Keep me updated.
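As a small convenience, the URL above can be built for other classes too. This is only a sketch: it assumes that every ontology class page follows the same pattern as the Place URL, which is not confirmed anywhere in this thread, and `class_page_url` is a made-up helper name.

```python
# Base taken from the Place URL quoted above.
BASE_URL = "http://mappings.dbpedia.org/server/ontology/classes/"


def class_page_url(class_name):
    """Build the class page URL for a name like 'Place' or 'dbo:Continent'."""
    name = class_name.strip()
    if name.startswith("dbo:"):
        # Drop the prefix and any space typed after the colon.
        name = name[len("dbo:"):].strip()
    # Assumption: other classes use the same URL pattern as Place.
    return BASE_URL + name


if __name__ == "__main__":
    print(class_page_url("Place"))
    print(class_page_url("dbo:Continent"))
```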

@yudhik11

commented Mar 9, 2019

I have worked on Task 2 by training on multiple classes and trying different numbers of total annotations.
A brief cumulative summary is shown here.
I plan to run the experiment with a wider variety of classes and more complex SPARQL queries.

I am now starting Task 3, which is 'reproducing the experiments'.

@Dewalade1

commented Mar 14, 2019

Hi, I would like to apply for GSoC and work on this project.

@mommi84

Member Author

commented Mar 20, 2019

Thanks for sharing your results @yudhik11 and thanks for your interest @Dewalade1. Have you guys already started writing your proposals? When you think your ideas are mature enough, please share a Google doc with my handle at gmail dot com. Remember to specify which research direction you may want to investigate.

@mugdhajoshi

commented Mar 24, 2019

I am stuck at running the build_vocab.py file. The code is written for Python 2.7, so I installed Python 2.7, but now I am not able to install TensorFlow. I get this while installing:
"Could not find a version that satisfies the requirement tensorflow (from versions: )
No matching distribution found for tensorflow"
I think TensorFlow only supports Python 3.5 and above.
Can someone help me?

@yudhik11

commented Mar 24, 2019

Always mention which OS you are using. FYI, TensorFlow is supported on Python 2.7.

If you are using ubuntu:

  • First, it is advisable to create a virtualenv, so that you can modify or add dependencies without affecting the whole system.
  • Once you are done with that, run:
      pip install tensorflow-gpu==1.3.0
      pip install tensorflow-tensorboard==0.1.8

@mugdhajoshi

commented Mar 24, 2019

I am using Windows 10.
https://www.tensorflow.org/install/pip?lang=python2
In the link above, under '2. Create a virtual environment (recommended)',
for Windows it says "TensorFlow is not supported on Windows with Python 2.7".
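The two observations so far (TensorFlow installs on Python 2.7 under Ubuntu, but is not supported on Windows with Python 2.7) can be turned into a quick pre-flight check to run before `pip install`. This is a sketch only: `python27_tensorflow_ok` is a made-up helper, and the rule it encodes is just what is stated in this thread, not a full compatibility matrix.

```python
import platform
import sys


def python27_tensorflow_ok(system, version_info):
    """Return (ok, reason) for installing TensorFlow on Python 2.7.

    `system` is a name as returned by platform.system(), for example
    "Windows" or "Linux"; `version_info` is a tuple like sys.version_info.
    """
    if tuple(version_info[:2]) != (2, 7):
        return False, "interpreter is not Python 2.7, which the project code targets"
    if system == "Windows":
        # Stated on the TensorFlow install page quoted above.
        return False, "TensorFlow is not supported on Windows with Python 2.7"
    return True, "Python 2.7 on a non-Windows system"


if __name__ == "__main__":
    ok, reason = python27_tensorflow_ok(platform.system(), sys.version_info)
    print("OK" if ok else "Problem", "-", reason)
```

The check is written to run under both Python 2.7 and Python 3, so it can be used on whichever interpreter is currently active.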

@wannabeOG

commented Mar 24, 2019

I haven't been in touch with Windows for a while now, but I do remember that TF was only compatible with Python 3 on Windows. This issue ("tensorflow/tensorflow#23603") makes me believe that there is no 2.7 package readily available for Windows. That being said, there are a couple of alternatives that could be used to bypass this problem. Please keep in mind that these are the alternatives I can think of at the moment, and there might be better ones that do not entail installing any additional software.

  1. Docker installation: Download and install Docker Toolbox for Windows from https://www.docker.com/docker-toolbox. After getting that done, follow the instructions given at https://www.tensorflow.org/install/docker to set up a Python 2.7 environment and use it to get the project running.
  2. Set up a virtual machine on your Windows platform itself, following the instructions here ("https://itsfoss.com/install-linux-in-virtualbox/"), and then follow the normal instructions for Linux. This could help in the long run, as carrying out development work on Windows is really cumbersome.

@nausheenfatma

Member

commented Mar 24, 2019

Hi @mugdhajoshi, if you are facing a lot of compatibility issues, you may also consider installing Ubuntu. This link shows how to install Ubuntu on Windows: https://tutorials.ubuntu.com/tutorial/tutorial-ubuntu-on-windows#0.

@mugdhajoshi

commented Mar 25, 2019

Thank you @wannabeOG, @nausheenfatma for your comments.
I have successfully installed Ubuntu in VirtualBox.

@nausheenfatma

Member

commented Mar 25, 2019

@mugdhajoshi great that you could make it work. Since the proposal deadline is approaching, you might quickly start discussing your ideas after doing the warm up tasks.

@theodore3131

commented Mar 29, 2019

Hi, I am working on the warm-up tasks stated in the wiki page, and my draft proposal is nearly complete. I have been following this project for a long time and would like to contribute to it. I think the README.md should state clearly that the environment is Python 2.7; that would save a lot of trouble. I would also like to know whether it is possible to upgrade the project to Python 3.x.
Thank you for your time!

@nausheenfatma

Member

commented Mar 29, 2019

@theodore3131 : That's great. It's advisable to quickly share your proposal in a Google doc through email stating your ideas for further discussion.

@rrichajalota

Contributor

commented Mar 29, 2019

@theodore3131 Of course, it's possible to upgrade the code to Python 3.x. If you are willing to do so during the GSoC timeline, then please mention it in your proposal. We would also be happy to have updated documentation for easier installation and execution of the project.
