Commit
[old datahub][m]: copied over DBLP, YAGO and open corporates datasets…
… from old datahub - refs datahubio/datahub-v2-pm#214 Also refs #29
1 parent
963e928
commit a74198b
Showing 4 changed files with 94 additions and 1 deletion.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,30 @@ | ||
--- | ||
title: Bibliographic data | ||
description: Existing databases or services providing substantial bibliographic data | ||
keywords: DBLP data | ||
date: 2018-07-17 | ||
modified: 2018-07-17 | ||
--- | ||
|
||
## The DBLP Computer Science Bibliography | ||
|
||
The DBLP computer science bibliography contains the metadata of over 1.8 million publications, written by over 1 million authors in several thousand journals or conference proceedings series. | ||
|
||
Although DBLP started with a focus on database systems and logic programming (hence the acronym), it has grown to cover all disciplines of computer science. | ||
|
||
### Data | ||
|
||
Resources list the full dump of the DBLP XML records (see http://dblp.uni-trier.de/xml/); a [simple DTD](http://dblp.uni-trier.de/db/about/dblp.dtd) is available. | ||
|
||
The paper "[DBLP - Some Lessons Learned](http://dblp.uni-trier.de/xml/docu/dblpxml.pdf)" documents technical details of this XML file. In the appendix ["DBLP XML Requests"][paper] you may find the description of a primitive DBLP API. | ||
|
||
[paper]: http://dblp.uni-trier.de/xml/docu/dblpxml.pdf | ||
|
||
### Openness: OPEN | ||
|
||
As of 2011-12-09 this data is open (released under ODC-By). See the license information in the [Readme.txt](http://dblp.uni-trier.de/xml/README.txt) and the announcement post: http://openbiblio.net/2011/12/09/dblp-releases-its-1-8-million-bibliographic-records-as-open-data/ | ||
|
||
### Data and Resources | ||
|
||
* [DBLP XML records (Full dump in xml (gzipped))](http://dblp.uni-trier.de/xml/dblp.xml.gz) | ||
* [DBLP DTD - The XML file references this DTD.](http://dblp.uni-trier.de/xml/dblp.dtd) |
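
Since the full dump is a single large XML file, a streaming parser is a natural way to read it. The following is a minimal sketch, not DBLP's own tooling: the inline sample and the element names (`article`, `inproceedings`, `author`, `title`, `year`) are assumptions chosen for illustration and should be checked against the DTD.

```python
import io
import xml.etree.ElementTree as ET

# Tiny stand-in for dblp.xml; the records and element names are assumptions.
SAMPLE = b"""<dblp>
<article key="journals/example/Doe11" mdate="2011-12-09">
<author>Jane Doe</author>
<author>John Roe</author>
<title>An Example Title.</title>
<year>2011</year>
</article>
<inproceedings key="conf/example/Roe10" mdate="2011-12-09">
<author>John Roe</author>
<title>Another Example.</title>
<year>2010</year>
</inproceedings>
</dblp>"""

# Record types to extract (assumed; the DTD lists the full set).
RECORD_TAGS = {"article", "inproceedings"}


def iter_records(stream):
    """Yield one dict per publication record without loading the whole file."""
    context = ET.iterparse(stream, events=("start", "end"))
    _, root = next(context)  # the first event is the start of the root element
    for event, elem in context:
        if event == "end" and elem.tag in RECORD_TAGS:
            yield {
                "key": elem.get("key"),
                "authors": [a.text for a in elem.findall("author")],
                "title": elem.findtext("title"),
                "year": elem.findtext("year"),
            }
            root.clear()  # drop finished records so memory use stays flat


records = list(iter_records(io.BytesIO(SAMPLE)))
for record in records:
    print(record["year"], record["title"], record["authors"])
```

For the real dump, the stream could come from `gzip.open("dblp.xml.gz", "rb")`. One caveat: if the file relies on named character entities declared in the DTD, `xml.etree.ElementTree` will not resolve them on its own, so a DTD-aware parser may be needed.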
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,32 @@ | ||
--- | ||
title: OpenCorporates - The Open Database Of The Corporate World | ||
description: Open Database of corporate entities. | ||
keywords: access-nobulk, corporations, database, ecommerce, format-rdf, government, lod, lodcloud-diagram-2011-09-19, no-deref-vocab, opendatachallenge, published-by-third-party, scraped, size.xlarge | ||
date: 2018-07-17 | ||
modified: 2018-07-17 | ||
--- | ||
|
||
Open Database of corporate entities. As of 2011-04-09 has information on 7,841,828 companies from around the world. Jurisdictions covered include: | ||
|
||
* 41,292 Bermuda | ||
* 3,886,733 United Kingdom | ||
* 96,104 Gibraltar | ||
* 105,640 Isle of Man | ||
* 77,693 Iceland | ||
* 60,827 Jersey | ||
* 92,795 Luxembourg | ||
* 2,188,873 Netherlands | ||
* 97,653 Alaska (US) | ||
* 197,798 District of Columbia (US) | ||
* 996,420 Michigan (US) | ||
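
The per-jurisdiction counts listed above can be checked against the stated total of 7,841,828 companies. The short sketch below uses only the figures from the list; the variable names are illustrative.

```python
# Company counts per jurisdiction, copied from the list above.
counts = {
    "Bermuda": 41_292,
    "United Kingdom": 3_886_733,
    "Gibraltar": 96_104,
    "Isle of Man": 105_640,
    "Iceland": 77_693,
    "Jersey": 60_827,
    "Luxembourg": 92_795,
    "Netherlands": 2_188_873,
    "Alaska (US)": 97_653,
    "District of Columbia (US)": 197_798,
    "Michigan (US)": 996_420,
}

STATED_TOTAL = 7_841_828  # total quoted in the description

listed_total = sum(counts.values())
largest = max(counts, key=counts.get)

print(listed_total == STATED_TOTAL)  # True: the list accounts for every company
print(largest, f"{counts[largest] / listed_total:.1%}")
```

The listed jurisdictions add up to exactly the stated total, so as of that date the list appears to be complete rather than a selection.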
|
||
There is good API access but currently no bulk availability. | ||
|
||
## License | ||
|
||
See https://opencorporates.com/info/licence. Note, however, that most data in OpenCorporates is scraped from elsewhere, so this license only covers the 'IP' that OpenCorporates has obtained as a result of its own efforts (and the license of the original databases, e.g. Companies House in the UK, is unclear). | ||
|
||
## Data and Resources | ||
|
||
* [Example JSON record from the API (for Google)](http://opencorporates.com/companies/gb/03977902.json) | ||
* [Example RDF record](http://opencorporates.com/companies/us_ak/124437.rdf) |
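
Both example records follow the same URL pattern: a jurisdiction code, a company number, and a format extension. The helper below is a hypothetical sketch that simply reproduces that pattern from the two links above; it is not part of any OpenCorporates client library, and the current API may use different endpoints.

```python
# Base path taken from the two example links above.
BASE_URL = "http://opencorporates.com/companies"

# Formats that appear in the examples; others are not assumed to exist.
KNOWN_FORMATS = ("json", "rdf")


def record_url(jurisdiction_code, company_number, fmt="json"):
    """Build a record URL following the pattern of the example links."""
    if fmt not in KNOWN_FORMATS:
        raise ValueError(f"unsupported format: {fmt!r}")
    return f"{BASE_URL}/{jurisdiction_code}/{company_number}.{fmt}"


print(record_url("gb", "03977902"))
print(record_url("us_ak", "124437", fmt="rdf"))
```

The jurisdiction codes in the examples (`gb`, `us_ak`) suggest a lower-case country code, with a state suffix for US jurisdictions.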
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,28 @@ | ||
--- | ||
title: The DBLP Computer Science Bibliography | ||
description: YAGO3 is a huge semantic knowledge base, derived from Wikipedia, WordNet and GeoNames. | ||
keywords: YAGO, ckanupload.esw.200910, crossdomain, format-rdf, linkeddata, lod, lodcloud-diagram-2011-09-19, lodcloud-diagram-2014-08-30, no-deref-vocab, no-license-metadata, no-provenance-metadata, ontology, published-by-producer | ||
date: 2018-07-17 | ||
modified: 2018-07-17 | ||
--- | ||
|
||
YAGO3 is a huge semantic knowledge base, derived from Wikipedia, WordNet and GeoNames. Currently, YAGO3 has knowledge of more than 10 million entities (such as persons, organizations and cities) and contains more than 120 million facts about these entities. | ||
|
||
## Data and Resources | ||
|
||
* [The entire YAGO in RDF/TTL/Turtle format](http://resources.mpi-inf.mpg.de/yago-naga/yago3.1/yago3.1_entire_ttl.7z) | ||
* [The entire YAGO in TSV format](http://resources.mpi-inf.mpg.de/yago-naga/yago3.1/yago3.1_entire_tsv.7z) | ||
* [Schema of YAGO in TTL/Turtle/RDF](http://resources.mpi-inf.mpg.de/yago-naga/yago3.1/yagoSchema.ttl.7z) | ||
* [rdf:type facts of YAGO in RDF/TTL/Turtle](http://resources.mpi-inf.mpg.de/yago-naga/yago3.1/yagoTypes.ttl.7z) | ||
* [Taxonomy of YAGO in RDF/TTL/Turtle](http://resources.mpi-inf.mpg.de/yago-naga/yago3.1/yagoTaxonomy.ttl.7z) | ||
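
The TSV dump lends itself to line-by-line processing once the archive has been extracted with an external 7z tool. The sketch below is illustrative only: it assumes each line holds a fact identifier, a subject, a predicate and an object separated by tabs, optionally followed by a further value column. That layout and the sample facts are assumptions and should be verified against the actual files.

```python
import csv
import io

# Invented sample lines in the assumed layout: id, subject, predicate, object, value.
SAMPLE = (
    "<id_1>\t<Albert_Einstein>\trdf:type\t<wikicat_Physicists>\t\n"
    "\t<Ulm>\t<isLocatedIn>\t<Germany>\t\n"
)


def iter_facts(lines):
    """Yield (subject, predicate, object) triples from tab-separated lines."""
    # QUOTE_NONE keeps quotation marks inside literals untouched.
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) < 4:
            continue  # skip blank or malformed lines
        _fact_id, subject, predicate, obj = row[:4]
        yield subject, predicate, obj


facts = list(iter_facts(io.StringIO(SAMPLE)))
for subject, predicate, obj in facts:
    print(subject, predicate, obj)
```

Because the function is a generator, the same loop can run over an open file handle without holding all of the facts in memory.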
|
||
See the YAGO web page for individual downloads: [Simplified taxonomy, multilingual, links to DBpedia, GeoNames, WordNet](http://www.mpi-inf.mpg.de/departments/databases-and-information-systems/research/yago-naga/yago/downloads/). These include: | ||
|
||
* TAXONOMY: All types of entities, and the class structure of YAGO2s. Moreover, it has formal definitions of YAGO relations. | ||
* SIMPLETAX: An alternative, simpler taxonomy of YAGO. | ||
* CORE: Core facts of YAGO2s, such as facts between entities and facts containing literals (numbers, dates, strings, etc.). | ||
* GEONAMES: Geographical entities, classes taken from GeoNames. | ||
* META: Temporally and spatially scoped facts together with statistics and extraction sources about the facts. | ||
* MULTILINGUAL: The multilingual names for entities. | ||
* LINK: The connection of YAGO2s to WordNet, DBpedia, etc. | ||
* OTHER: Miscellaneous features of YAGO2s, such as Wikipedia in-outlinks, etc. |