Gazetteer list as a feature #24

Open
My-Khan opened this issue May 30, 2016 · 6 comments

Comments

@My-Khan

My-Khan commented May 30, 2016

RNNs support word embedding features, which is a big advantage over the competing CRF approach. Do RNNs also support external gazetteer lists and dictionaries as features?

@zhongkaifu
Owner

Yes, RNNSharp supports it. It's called TFeature (template feature). The README file describes in detail how it works and how to use it.

@My-Khan
Author

My-Khan commented May 30, 2016

Thanks for the prompt reply. I read the README, but I am still confused. Besides the training data, I have a separate text file that contains country names. How can this separate file be used as a feature alongside the training data when training with RNNSharp?
For example, my training file, named "mytrain", contains data in the following format:
کو PSP NOR
بھارت PNN S_LOCATION
سے PSP NOR
تعلق NN NOR
رکھنے VBI NOR

The gazetteer list, named "MyConList", contains data in the following format:
1: PAKISTAN
2: INDIA
3: CHINA
4: USA

My template file contains the following templates:
U01:%x[-1,0]
U02:%x[0,0]

So during training, these templates generate features only from the training file "mytrain".
Please advise how to use the separate gazetteer file "MyConList" when training the RNN.
Thanks in advance.

@zhongkaifu
Owner

You could read the [Template Features] section in the README file. It has an example of how to use template features. In RNNSharp, template features are binarized by TFeatureBin.exe, and RNNSharp then uses the result.

@My-Khan
Author

My-Khan commented May 31, 2016

Apologies in advance, but it is still not clear; perhaps I am not explaining my problem well. I have no problem with template features; I can generate them easily using TFeatureBin.exe. These are the steps I follow:
To generate template features from the following data, stored in a file named "mytrain.txt", I use TFeatureBin.exe in build mode.

! PUN S
Tokyo NNP S_LOCATION
and CC S
New NNP B_LOCATION
York NNP E_LOCATION
are VBP S
major JJ S

After running TFeatureBin.exe, it generates two files, named tfeature.template and tfeature. Is that right?
I then specify the output file in the config file used by RNNSharp, e.g. TFEATURE_FILENAME:tfeatures.
This works well.
In the steps above I used only one file, "myTrain.txt", to generate the template features. If I have another file, a gazetteer named "myConList.txt" containing data in the following format, how can template features be generated from both files using TFeatureBin.exe?
1: PAKISTAN
2: INDIA
3: CHINA
4: USA

@bratao

bratao commented May 31, 2016

@My-Khan, you need to write a script yourself to inject this kind of feature into your "mytrain.txt".
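A minimal sketch of such a script, in Python, might look like the following. It is not part of RNNSharp; the function names, the `GAZ_COUNTRY` / `O` flag values, and the choice to insert the new column just before the label column are all assumptions made for illustration. It assumes whitespace-separated columns with the token first and the label last, as in the samples above, and it only matches single-token gazetteer entries.

```python
import re
from typing import Iterable, List, Set


def load_gazetteer(lines: Iterable[str]) -> Set[str]:
    """Parse lines such as '1: PAKISTAN' into a set of lowercased names."""
    names = set()
    for line in lines:
        # Drop an optional leading "N:" index, then surrounding whitespace.
        name = re.sub(r"^\s*\d+\s*:\s*", "", line).strip()
        if name:
            names.add(name.lower())
    return names


def inject_gazetteer_column(rows: Iterable[str], gazetteer: Set[str],
                            hit: str = "GAZ_COUNTRY",
                            miss: str = "O") -> List[str]:
    """Insert a gazetteer flag before the last column (assumed to be the label).

    The flag values `hit` and `miss` are hypothetical; any two distinct
    strings would do.
    """
    out = []
    for row in rows:
        fields = row.split()
        if not fields:
            out.append("")  # keep blank lines that separate sentences
            continue
        if len(fields) < 2:
            raise ValueError(f"expected at least token and label: {row!r}")
        # Case-insensitive lookup of the token (first column).
        flag = hit if fields[0].lower() in gazetteer else miss
        out.append(" ".join(fields[:-1] + [flag, fields[-1]]))
    return out


if __name__ == "__main__":
    # Gazetteer lines in the format shown in the thread.
    gazetteer = load_gazetteer(["1: PAKISTAN", "2: INDIA", "3: CHINA", "4: USA"])
    # Illustrative rows; the "India" row is made up for this example.
    sample = ["India NNP S_LOCATION", "", "Tokyo NNP S_LOCATION"]
    for line in inject_gazetteer_column(sample, gazetteer):
        print(line)
```

With a token / POS / label file, the new flag ends up in column 2, so a template line following the `%x[row,column]` pattern already used in the thread (for example `U03:%x[0,2]`) would presumably pick it up once the template features are rebuilt from the augmented file. The gazetteer entries need to be in the same script as the tokens; an English list will not match Urdu tokens, and multi-word names such as "New York" would need extra handling.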

@My-Khan
Author

My-Khan commented May 31, 2016

@bratao Many thanks for the guidance. Hmm, this has now become a hard nut to crack for me. Could somebody help?
