
README.md

0 contents-all.txt 34 01-Aug-2012 11:26 9.3K
0 contents-nq.txt 38 01-Aug-2012 11:26 2.1K
0 contents-nt.txt 36 01-Aug-2012 11:26 2.5K
0 contents-tql.txt 37 01-Aug-2012 11:26 2.1K
0 contents-ttl.txt 35 01-Aug-2012 11:26 2.5K
0 instance_types_en.nq.bz2 0 29-Jun-2012 13:18 97M Contains triples of the form $object rdf:type $class from the ontology-based extraction.
2 mappingbased_properties_unredirected_en.nq.bz2 10 29-Jun-2012 03:11 251M High-quality data extracted from Infoboxes using the ontology-based extraction. The predicates in this dataset are in the /ontology/ namespace. Used to be called Mapping Based Properties in previous releases.
2 specific_mappingbased_properties_en.nq.bz2 30 29-Jun-2012 08:05 11M Infobox data from the ontology-based extraction, using units of measurement more convenient for the resource type, e.g. square kilometres instead of square metres for the area of a city.
3 labels_en.nq.bz2 16 25-Jul-2012 15:29 208M Titles of all Wikipedia articles in the corresponding language.
4 short_abstracts_en.nq.bz2 7 25-Jul-2012 18:29 382M Short abstracts (max. 500 characters long) of Wikipedia articles.
5 long_abstracts_en.nq.bz2 6 25-Jul-2012 15:33 682M Full abstracts of Wikipedia articles, usually the first section.
7 geo_coordinates_en.nq.bz2 24 28-Jun-2012 21:25 20M Geographic coordinates extracted from Wikipedia.
9 homepages_en.nq.bz2 29 29-Jun-2012 13:18 13M Links to homepages of persons, organizations etc.
10 persondata_unredirected_en.nq.bz2 21 29-Jun-2012 04:39 72M
14 article_categories_en.nq.bz2 12 28-Jun-2012 22:23 249M Links from concepts to categories using the SKOS vocabulary.
14 category_labels_en.nq.bz2 27 29-Jun-2012 11:00 16M Labels for categories.
16 external_links_en.nq.bz2 18 28-Jun-2012 21:23 185M Links to external web pages about a concept.
16 page_links_unredirected_en.nq.bz2 2 29-Jun-2012 00:15 1700G Dataset containing internal links between DBpedia instances. The dataset was created from the internal links between Wikipedia articles. The dataset might be useful for structural analysis, data mining or for ranking DBpedia instances using PageRank or similar algorithms.
16 redirects_transitive_en.nt.bz2 1 12-Jul-2012 11:00 92M Redirects dataset in which multiple redirects have been resolved and redirect cycles have been removed.
17 disambiguations_unredirected_en.nq.bz2 28 29-Jun-2012 14:49 15M Links extracted from Wikipedia disambiguation pages. Since Wikipedia has no syntax to distinguish disambiguation links from ordinary links, DBpedia has to use heuristics.
18 page_ids_en.nq.bz2 15 27-Jul-2012 22:58 216M Dataset linking a DBpedia resource to the page ID of the Wikipedia article the data was extracted from.
l1 geonames_links.nt.bz2 40 xx xxM Links between geographic places in DBpedia and data about them from GeoNames. Links created by Silk link specifications.
n1 topical_concepts_unredirected_en.nq.bz2 31 09-Jul-2012 18:37 1.7M

  • 6 images_en.nq.bz2 20 29-Jun-2012 01:20 103M Main image and corresponding thumbnail from Wikipedia article.
  • 8 infobox_properties_en.nq.bz2 4 25-Jul-2012 14:49 723M Information that has been extracted from Wikipedia infoboxes. Note that this data is in the less clean /property/ namespace. The Ontology Infobox Properties (/ontology/ namespace) should always be preferred over this data.
  • 8 infobox_properties_unredirected_en.nq.bz2 5 29-Jun-2012 01:06 722M Information that has been extracted from Wikipedia infoboxes. Note that this data is in the less clean /property/ namespace. The Ontology Infobox Properties (/ontology/ namespace) should always be preferred over this data.
  • 8 infobox_property_definitions_en.nq.bz2 39 29-Jun-2012 03:38 1.4M All properties / predicates used in infoboxes.
  • 8 infobox_test_en.nq.bz2 9 28-Jun-2012 22:14 262M
  • 1 mappingbased_properties_en.nq.bz2 11 25-Jul-2012 15:39 251M High-quality data extracted from Infoboxes using the ontology-based extraction. The predicates in this dataset are in the /ontology/ namespace. Used to be called Mapping Based Properties in previous releases.

-10 pnd_en.nq.bz2 33 29-Jun-2012 14:49 45K Dataset containing PND (Personennamendatei) identifiers.
-12 interlanguage_links_en.nq.bz2 17 25-Jul-2012 15:15 205M Dataset linking a DBpedia resource to the same or a related resource in other languages, extracted from the inter-language links of a Wikipedia article.
-15 skos_categories_en.nq.bz2 23 29-Jun-2012 04:17 41M Information about which concepts are categories and how categories are related, using the SKOS vocabulary.
-16 redirects_en.nq.bz2 19 12-Jul-2012 12:14 119M Dataset containing redirects between articles in Wikipedia.
-16 page_links_en.nq.bz2 3 25-Jul-2012 16:25 1700G Dataset containing internal links between DBpedia instances. The dataset was created from the internal links between Wikipedia articles. The dataset might be useful for structural analysis, data mining or for ranking DBpedia instances using PageRank or similar algorithms.
-17 disambiguations_en.nq.bz2 26 25-Jul-2012 14:30 16M Links extracted from Wikipedia disambiguation pages. Since Wikipedia has no syntax to distinguish disambiguation links from ordinary links, DBpedia has to use heuristics.
-17 iri_same_as_uri_en.nq.bz2 25 25-Jul-2012 17:39 16M owl:sameAs links between the IRI and URI format of DBpedia resources. Only extracted when IRI and URI are actually different.
-19 revision_ids_en.nq.bz2 14 27-Jul-2012 23:07 225M Dataset linking a DBpedia resource to the revision ID of the Wikipedia article the data was extracted from. Until DBpedia 3.7, these files had names like 'revisions_en.nt'. In DBpedia 3.8 they were renamed to 'revision_ids_en.nt' to distinguish them from the new 'revision_uris_en.nt' files.
-19 revision_uris_en.nq.bz2 13 27-Jul-2012 23:17 243M Dataset linking a DBpedia resource to the specific Wikipedia article revision used in this DBpedia release.
-10 persondata_en.nq.bz2 22 25-Jul-2012 18:26 72M Information about persons (date and place of birth etc.) extracted from the English and German Wikipedia, represented using the FOAF vocabulary.
-n1 topical_concepts_en.nq.bz2 32 25-Jul-2012 18:32 1.7M We tokenize all Wikipedia paragraphs linking to DBpedia resources and aggregate them in a Vector Space Model of terms weighted by their co-occurrence with the target resource. We use those vectors to select the strongest related terms and build topic signatures for those entities.
-16 wikipedia_links_en.nq.bz2 8 29-Jun-2012 04:05 311M Dataset linking a DBpedia resource to the corresponding article in Wikipedia.
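
Several descriptions in the listing mention which namespace a file's predicates live in (/ontology/ versus /property/). One quick way to check this on a downloaded file is to count statements per predicate. The sketch below assumes each line of the decompressed file is a single N-Quads or N-Triples statement whose second whitespace-separated field is the predicate; the function name `predicate_counts` is made up for this example.

```shell
# Count statements per predicate in a bzip2-compressed N-Quads/N-Triples file.
# Assumption: one statement per line, predicate in the second field.
# The body runs in a subshell "( ... )" so nothing leaks into the caller.
predicate_counts() (
  bzcat "$1" |
    awk '!/^#/ && NF { print $2 }' |  # skip comments and blank lines
    sort |
    uniq -c |                         # prefix each predicate with its count
    sort -rn                          # most frequent predicate first
)

# Example (path follows the layout that `wget -x` creates):
#   predicate_counts downloads.dbpedia.org/3.8/en/geo_coordinates_en.nq.bz2 | head
```

On the larger files this streams the whole archive through `sort`, so it can take a while; piping `bzcat` through `head -n 100000` first gives a rough sample.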

parallel -j3 wget -nc -x http://downloads.dbpedia.org/3.8/en/{} ::: contents-all.txt contents-nq.txt contents-nt.txt contents-tql.txt contents-ttl.txt instance_types_en.nq.bz2 mappingbased_properties_unredirected_en.nq.bz2 specific_mappingbased_properties_en.nq.bz2 labels_en.nq.bz2 short_abstracts_en.nq.bz2 long_abstracts_en.nq.bz2 geo_coordinates_en.nq.bz2 homepages_en.nq.bz2 persondata_unredirected_en.nq.bz2 article_categories_en.nq.bz2 category_labels_en.nq.bz2 external_links_en.nq.bz2 redirects_transitive_en.nt.bz2 wikipedia_links_en.nq.bz2 disambiguations_unredirected_en.nq.bz2 page_ids_en.nq.bz2 topical_concepts_unredirected_en.nq.bz2
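
If GNU parallel is not installed, roughly the same download can be done with `xargs`. This is only a sketch: the function names `dbpedia_urls` and `fetch_dbpedia` are invented here, and `xargs -P` is an extension found in GNU and BSD xargs, not plain POSIX.

```shell
# Base URL taken from the parallel/wget command in this README.
DBPEDIA_BASE="http://downloads.dbpedia.org/3.8"

# Print one download URL per file name.
# $1 is the subdirectory ("en" or "links"); remaining arguments are file names.
# The body runs in a subshell "( ... )" so its variables stay local.
dbpedia_urls() (
  subdir=$1
  shift
  for name in "$@"; do
    printf '%s/%s/%s\n' "$DBPEDIA_BASE" "$subdir" "$name"
  done
)

# Fetch up to three URLs at a time, similar to `parallel -j3`.
# wget -nc skips files that already exist; -x recreates host/path directories.
fetch_dbpedia() {
  dbpedia_urls "$@" | xargs -n 1 -P 3 wget -nc -x
}

# Example (not run automatically):
#   fetch_dbpedia en labels_en.nq.bz2 geo_coordinates_en.nq.bz2
```

Because of `-nc`, rerunning `fetch_dbpedia` after an interruption skips files that are already present, but it does not resume a partially downloaded file.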

parallel -j3 wget -nc -x http://downloads.dbpedia.org/3.8/links/{} ::: geonames_links.nt.bz2 musicbrainz_links.nt.bz2 nytimes_links.nt.bz2 uscensus_links.nt.bz2 wordnet_links.nt.bz2 yago_links.nt.bz2
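
Since `wget -nc` never re-fetches a file that already exists, a download that was cut off stays truncated on later runs. A minimal sketch for spotting such files is to test every archive with `bzip2 -t`; the function name `check_bz2` is made up for this example.

```shell
# Test each bzip2 archive and report OK or BAD per file.
# Returns non-zero if at least one archive fails the integrity test.
# The body runs in a subshell "( ... )" so its variables stay local.
check_bz2() (
  status=0
  for file in "$@"; do
    if bzip2 -t "$file" 2>/dev/null; then
      printf 'OK   %s\n' "$file"
    else
      printf 'BAD  %s\n' "$file"
      status=1
    fi
  done
  exit "$status"
)

# Example (paths follow the layout that `wget -x` creates):
#   check_bz2 downloads.dbpedia.org/3.8/en/*.bz2 downloads.dbpedia.org/3.8/links/*.bz2
```

Any file reported as BAD can be deleted and fetched again with the commands above.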

  • Links to MusicBrainz
  • Links to New York Times
  • Links to US Census
  • Links to WordNet Classes
