structure of results.xml #13

Open
tarrow opened this issue Apr 29, 2016 · 1 comment

Comments


tarrow commented Apr 29, 2016

Just trying to get an idea of what should actually be in results.xml. Currently we turn out snippets like this:

Word Frequency

<?xml version="1.0" encoding="UTF-8"?>
<results title="frequencies">
 <result title="frequency" word="malaria" count="72"/>
</results>

Binomial Species

<?xml version="1.0" encoding="UTF-8"?>
<results title="binomial">
 <result pre=" species. In Madagascar, bimonthly treatment with the anthelmintic levamisole had no effect on " exact="Plasmodium falciparum" xpath="/html[1]/body[1]/div[2]/div[2]/div[3]/div[3]/p[1]" match="Plasmodium falciparum" post=" parasite density among children aged &amp;amp;lt;5 years but, among children aged ≥15 years, resulted " name="binomial"/>
</results>
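For anyone consuming these files, here is a minimal sketch of reading the current format with Python's standard library. The helper names `read_results` and `context` are hypothetical, and the `pre`/`post` strings are shortened for the example; only the element and attribute names come from the snippets above.

```python
import xml.etree.ElementTree as ET

# Sample shaped like the binomial snippet above; context strings are shortened.
SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<results title="binomial">
 <result pre="had no effect on " exact="Plasmodium falciparum" xpath="/html[1]/body[1]/div[2]/div[2]/div[3]/div[3]/p[1]" match="Plasmodium falciparum" post=" parasite density" name="binomial"/>
</results>"""


def read_results(xml_bytes):
    """Return the <results> title and one attribute dict per <result>."""
    root = ET.fromstring(xml_bytes)
    hits = [dict(result.attrib) for result in root.iter("result")]
    return root.get("title"), hits


def context(hit):
    """Rebuild the text window around a hit from pre + exact + post."""
    return hit.get("pre", "") + hit.get("exact", "") + hit.get("post", "")


title, hits = read_results(SAMPLE.encode("utf-8"))
print(title, len(hits))
print(context(hits[0]))
```

The same reader would also handle the word-frequency snippet, since it only collects attributes; note that `count` arrives as a string and would need converting with `int()`.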

Member

petermr commented Apr 29, 2016

All files should contain audit/log metadata. This could be something like:

<metadata
  rundate="2016-04-29"
  query="species binomial"
  program="ami_0.3.1"
  os="macosx.10.2"
  ... etc
  inputStream="..."
  stemming="true"
  caseSensitive="no"
/>
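As a rough illustration of the audit metadata idea, the sketch below builds a `<metadata>` element and attaches it to `<results>`. It is an assumption-laden example rather than how ami does it: the helper `build_metadata` is hypothetical, placing `<metadata>` as the first child of `<results>` is my guess (the thread does not say where it goes), and `os` is filled from Python's `platform.platform()` rather than the `macosx.10.2` style shown above.

```python
import platform
import xml.etree.ElementTree as ET
from datetime import date


def build_metadata(query, program, **options):
    """Build a <metadata> element carrying run information as attributes.

    Hypothetical helper: attribute names follow the proposal in this thread.
    """
    meta = ET.Element("metadata")
    meta.set("rundate", date.today().isoformat())  # e.g. 2016-04-29
    meta.set("query", query)
    meta.set("program", program)
    meta.set("os", platform.platform())
    # Any further options (stemming, caseSensitive, ...) become attributes.
    for name, value in options.items():
        meta.set(name, str(value))
    return meta


results = ET.Element("results", {"title": "binomial"})
# Assumption: metadata sits inside <results> as its first child.
results.append(
    build_metadata("species binomial", "ami_0.3.1", stemming="true", caseSensitive="no")
)
xml_text = ET.tostring(results, encoding="unicode")
print(xml_text)
```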

My edits to the snippets above:

Word Frequency

<metadata .../>

Binomial Species

<metadata
...
plugin="species" // maybe a better word than species
option="binomial"
parameters="expandAbbreviations"
dictionary="cmine.species"
dictionaryId="http://cmine...

<result pre=" species. In Madagascar, bimonthly treatment with the anthelmintic levamisole had no effect on "
  exact="Plasmodium falciparum" // the expanded string
  xpath="/html[1]/body[1]/div[2]/div[2]/div[3]/div[3]/p[1]"
  ^^^ keep as this, not "local-name()"
  maybe, when we have sentences working:
  xpath="/html[1]/body[1]/div[2]/div[2]/div[3]/div[3]/p[1]/span[@class='sentence'][3]"
  match="P. falciparum" // the surface match
  wikidataId="225.2345" or whatever
  dictionaryId="1234598" links to CMine dictionary
  post=" parasite density among children aged &amp;lt;5 years but, among children aged ≥15 years, resulted "
  name="binomial"/>
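To show why keeping both `exact` (the expanded string) and `match` (the surface match) could be useful, here is a small sketch that groups the surface forms seen for each expanded name. The function `surface_forms` and the two-result sample are hypothetical; falling back to `exact` when `match` is absent is my assumption, not something stated in the thread.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical sample: two hits for the same species, one abbreviated in the text.
SAMPLE = """<results title="binomial">
 <result exact="Plasmodium falciparum" match="P. falciparum" name="binomial"/>
 <result exact="Plasmodium falciparum" match="Plasmodium falciparum" name="binomial"/>
</results>"""


def surface_forms(xml_text):
    """Map each expanded name (exact) to the sorted surface forms (match) seen."""
    forms = defaultdict(set)
    for result in ET.fromstring(xml_text).iter("result"):
        exact = result.get("exact")
        # Assumption: a missing match attribute means the text was not abbreviated.
        forms[exact].add(result.get("match", exact))
    return {name: sorted(found) for name, found in forms.items()}


print(surface_forms(SAMPLE))
```

With both attributes present, counts can be aggregated on `exact` while `match` still records what the paper actually printed.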



Peter Murray-Rust
Reader in Molecular Informatics
Unilever Centre, Dep. Of Chemistry
University of Cambridge
CB2 1EW, UK
+44-1223-763069
