Commit
- Added documentation for PMC external repo
- Added documentation for BioC format
Showing 6 changed files with 119 additions and 4 deletions.
16 changes: 16 additions & 0 deletions
...resources/META-INF/asciidoc/user-guide/external-search-repos-pubannotation.adoc
31 changes: 31 additions & 0 deletions
.../src/main/resources/META-INF/asciidoc/user-guide/external-search-repos-pmc.adoc
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,31 @@ | ||
// Licensed to the Technische Universität Darmstadt under one | ||
// or more contributor license agreements. See the NOTICE file | ||
// distributed with this work for additional information | ||
// regarding copyright ownership. The Technische Universität Darmstadt | ||
// licenses this file to you under the Apache License, Version 2.0 (the | ||
// "License"); you may not use this file except in compliance | ||
// with the License. | ||
// | ||
// http://www.apache.org/licenses/LICENSE-2.0 | ||
// | ||
// Unless required by applicable law or agreed to in writing, software | ||
// distributed under the License is distributed on an "AS IS" BASIS, | ||
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
// See the License for the specific language governing permissions and | ||
// limitations under the License. | ||
|
||
[[sect_external-search-repos-pmc]] | ||
= PubMed Central | ||
|
||
==== | ||
CAUTION: Experimental feature. To use this functionality, you need to enable it first by adding `external-search.pmc.enabled=true` to the `settings.properties` file (see the <<admin-guide.adoc#sect_settings, Admin Guide>>). You should also add `format.bioc.enabled=true` to enable | ||
support for the BioC format used by this repository connector. | ||
==== | ||
|
||
link:https://www.ncbi.nlm.nih.gov/pmc/[PubMed Central]® (PMC) is a free full-text archive of biomedical and life sciences journal literature at the U.S. National Institutes of Health's National Library of Medicine (NIH/NLM). It can be added as an external document repository by | ||
selecting the **PubMed Central** repository type. | ||
|
||
NOTE: {product-name} uses the BioC version of the PMC documents for import. The search tries to | ||
consider only documents that have full text available, but the BioC version of these texts may be | ||
available only with a delay. If you cannot import a recently uploaded document from PMC into | ||
{product-name}, try again a day later. |
61 changes: 61 additions & 0 deletions
...ption-io-bioc/src/main/resources/META-INF/asciidoc/user-guide/formats-bioc.adoc
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,61 @@ | ||
// Licensed to the Technische Universität Darmstadt under one | ||
// or more contributor license agreements. See the NOTICE file | ||
// distributed with this work for additional information | ||
// regarding copyright ownership. The Technische Universität Darmstadt | ||
// licenses this file to you under the Apache License, Version 2.0 (the | ||
// "License"); you may not use this file except in compliance | ||
// with the License. | ||
// | ||
// http://www.apache.org/licenses/LICENSE-2.0 | ||
// | ||
// Unless required by applicable law or agreed to in writing, software | ||
// distributed under the License is distributed on an "AS IS" BASIS, | ||
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
// See the License for the specific language governing permissions and | ||
// limitations under the License. | ||
|
||
[[sect_formats_bioc]] | ||
= BioC (experimental) | ||
|
||
==== | ||
CAUTION: Experimental feature. To use this functionality, you need to enable it first by adding `format.bioc.enabled=true` to the `settings.properties` file (see the <<admin-guide.adoc#sect_settings, Admin Guide>>). | ||
==== | ||
|
||
Support for the BioC format is new and still experimental. | ||
|
||
* Sentence information is supported | ||
* If sentences are present in a BioC document, they are imported. Otherwise, {product-name} will | ||
automatically try to determine sentence boundaries. | ||
* On export, the BioC files are always created with sentence information. | ||
* Passages are imported as `Div` annotations and the passage `type` infon is set as the `type` | ||
feature on these `Div` annotations. | ||
* When reading span or relation annotations, the `type` infon is used to look up a suitable | ||
annotation layer. If a layer exists where either the full technical name of the layer or the | ||
simple technical name (the part after the last dot) matches the type, then an attempt will be made | ||
to match the annotation to that layer. If the annotation has other infons that match features on | ||
that layer, they will also be matched. If no layer matches but the default `SimpleSpan` layer is | ||
present, annotations will be matched to that. Similarly, if only a single infon is present in an | ||
annotation and no other feature matches, then the infon value may be matched to a potentially | ||
existing `value` feature. | ||
* When exporting annotations, the `type` infon will always be set to the full layer name and | ||
features will be serialized to infons matching their names. | ||
* If a document has not been imported from a BioC file containing passages and does not contain | ||
`Div` annotations from any other source either, then on export a single passage containing the | ||
entire document is created. | ||
* Cross-passage relations are not supported. | ||
* Passage-level infons are not supported. | ||
* Document-level infons are not supported. | ||
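
The layer lookup described in the list above can be sketched roughly as follows. This is a minimal illustration in Python, not the actual {product-name} implementation: the function names, the example layer names, and the assumption that names are compared by exact, case-sensitive equality are all hypothetical. Feature matching and the single-infon `value` rule are left out.

```python
from typing import List, Optional

# Hypothetical layer names, for illustration only; a real project defines its own.
LAYERS = [
    "de.tudarmstadt.ukp.dkpro.core.api.ner.type.NamedEntity",
    "custom.SimpleSpan",
]

# Assumption: the default fallback layer is identified by its simple name.
DEFAULT_LAYER_SIMPLE_NAME = "SimpleSpan"


def simple_name(layer: str) -> str:
    """Return the part of a technical layer name after the last dot."""
    return layer.rsplit(".", 1)[-1]


def find_layer(type_infon: str, layers: List[str]) -> Optional[str]:
    """Pick a layer for an annotation based on its `type` infon."""
    # First, look for a layer whose full or simple technical name equals the type.
    for layer in layers:
        if type_infon in (layer, simple_name(layer)):
            return layer
    # Otherwise, fall back to the default SimpleSpan layer if it is present.
    for layer in layers:
        if simple_name(layer) == DEFAULT_LAYER_SIMPLE_NAME:
            return layer
    # No suitable layer: the annotation cannot be matched.
    return None
```

With the example layers, a `type` infon of `NamedEntity` resolves to the named entity layer through its simple name, while an unknown type such as `Gene` falls back to the `SimpleSpan` layer.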
|
||
|
||
[cols="2,1,1,1,3"] | ||
|==== | ||
| Format | Read | Write | Custom Layers | Description | ||
|
||
| link:https://raw.githubusercontent.com/2mh/PyBioC/master/BioC.dtd[BioC (experimental)] (`bioc`) | ||
| yes | ||
| yes | ||
| yes | ||
| BioC format | ||
|
||
|==== | ||
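
To make the export behavior described in this section more concrete, the following Python sketch builds a BioC `document` element for a text that has no `Div` annotations: it creates a single passage covering the whole document, always writes sentence elements, sets the `type` infon to the full layer name, and writes each feature as an infon named after it. The function names, the input data layout, and the layer name `custom.Gene` are assumptions made for this example; the real exporter may differ in details such as annotation IDs.

```python
import xml.etree.ElementTree as ET


def add_infon(parent, key, value):
    """Append <infon key="...">value</infon> to the parent element."""
    element = ET.SubElement(parent, "infon", key=key)
    element.text = value
    return element


def export_document(doc_id, text, sentences, annotations):
    """Build a BioC <document> with one passage spanning the whole text.

    sentences:   list of (begin, end) character offsets
    annotations: list of dicts with keys layer, begin, end, features
    """
    document = ET.Element("document")
    ET.SubElement(document, "id").text = doc_id

    # No Div annotations: a single passage contains the entire document.
    passage = ET.SubElement(document, "passage")
    ET.SubElement(passage, "offset").text = "0"

    for s_begin, s_end in sentences:
        sentence = ET.SubElement(passage, "sentence")
        ET.SubElement(sentence, "offset").text = str(s_begin)
        ET.SubElement(sentence, "text").text = text[s_begin:s_end]

        for index, ann in enumerate(annotations):
            # Only annotations fully inside this sentence are written here.
            if not (s_begin <= ann["begin"] and ann["end"] <= s_end):
                continue
            annotation = ET.SubElement(sentence, "annotation", id=str(index))
            # The type infon carries the full layer name.
            add_infon(annotation, "type", ann["layer"])
            # Each feature becomes an infon with the feature's name as key.
            for name, value in ann["features"].items():
                add_infon(annotation, name, value)
            ET.SubElement(
                annotation,
                "location",
                offset=str(ann["begin"]),
                length=str(ann["end"] - ann["begin"]),
            )
            ET.SubElement(annotation, "text").text = text[ann["begin"]:ann["end"]]

    return document


example = export_document(
    "doc1",
    "BRCA1 is a gene. It matters.",
    [(0, 16), (17, 28)],
    [{"layer": "custom.Gene", "begin": 0, "end": 5, "features": {"value": "BRCA1"}}],
)
print(ET.tostring(example, encoding="unicode"))
```

The element order inside `annotation` (infons, then location, then text) follows the BioC DTD linked in the table above.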
|