GleanerIO

A set of projects implementing principles for indexing structured data on the web / schema.org (developed as part of NSF's EarthCube)

About

Gleaner is a tool for extracting JSON-LD from web pages. You provide Gleaner with a list of sites to index, and it retrieves pages based on the sitemap.xml of each domain. Gleaner can then check that the documents are well-formed and valid, and process the JSON-LD data graphs into a form that can drive a search interface.
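
The harvesting flow described above (read a sitemap, fetch each page, pull out the embedded JSON-LD) can be sketched in a few lines. This is a minimal illustration using only the Python standard library, not Gleaner's actual implementation (Gleaner itself is written in Go). The names `sitemap_urls`, `extract_jsonld`, `JSONLDExtractor`, and the sample documents are hypothetical, and the sketch works on in-memory strings rather than fetching anything over the network.

```python
import json
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Namespace used by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hypothetical sample inputs, standing in for a fetched sitemap and page.
SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.org/dataset/1</loc></url>
  <url><loc>https://example.org/dataset/2</loc></url>
</urlset>"""

SAMPLE_PAGE = """<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org/", "@type": "Dataset", "name": "Example dataset"}
</script>
<script>console.log("not JSON-LD");</script>
</head><body></body></html>"""


def sitemap_urls(sitemap_xml):
    """Return the <loc> values listed in a sitemap.xml document."""
    root = ET.fromstring(sitemap_xml)
    return [loc.text.strip() for loc in root.findall(".//sm:loc", SITEMAP_NS) if loc.text]


class JSONLDExtractor(HTMLParser):
    """Collect the text of <script type="application/ld+json"> elements."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append("".join(self._buffer))
            self._in_jsonld = False


def extract_jsonld(html):
    """Parse every JSON-LD script block in a page, skipping malformed JSON."""
    parser = JSONLDExtractor()
    parser.feed(html)
    parser.close()
    docs = []
    for block in parser.blocks:
        try:
            docs.append(json.loads(block))
        except json.JSONDecodeError:
            continue  # not well-formed JSON; a real harvester would log this
    return docs


if __name__ == "__main__":
    print(sitemap_urls(SAMPLE_SITEMAP))
    print(extract_jsonld(SAMPLE_PAGE))
```

The sketch only checks that each block parses as JSON. Validating the structure of the resulting graph (for example with SHACL) and loading it into a triplestore are separate steps handled elsewhere in the GleanerIO tooling.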

Pinned

  1. gleaner Public

    Gleaner: JSON-LD and structured data on the web harvesting

    Go 17 11

  2. nabu Public

    Nabu: Synchronize data graph objects with a triplestore

    Go 1 3

  3. scheduler Public

    Scheduling approaches related to gleaner tooling

    Python 1 4

  4. archetype Public

    A testbench repo with the three primary personas of user, provider and indexer.

    CSS 4 1

  5. notebooks Public

    Jupyter notebooks for SHACL processing, JSON-LD framing and object operations

    Jupyter Notebook 1

  6. scienceonschemaexamples Public

    This repository will contain actual science on schema JSON-LD files, and HTML pages that contain JSON-LD scripts to document as possible test cases

    HTML

Repositories

Showing 10 of 11 repositories
  • scheduler Public

    Scheduling approaches related to gleaner tooling
    Python 1 Apache-2.0 4 8 0 Updated Nov 4, 2024
  • gleaner Public

    Gleaner: JSON-LD and structured data on the web harvesting
    Go 17 Apache-2.0 11 69 (5 issues need help) 1 Updated Oct 15, 2024
  • nabu Public

    Nabu: Synchronize data graph objects with a triplestore
  • archetype Public

    A testbench repo with the three primary personas of user, provider and indexer.
    CSS 4 1 5 0 Updated Jul 29, 2024
  • signposting Public

    A test of approaches to parse signposting conventions
    Go 0 0 0 0 Updated Jul 18, 2023
  • notebooks Public

    Jupyter notebooks for SHACL processing, JSON-LD framing and object operations
    Jupyter Notebook 0 1 0 0 Updated May 24, 2023
  • scienceonschemaexamples Public

    This repository will contain actual science on schema JSON-LD files, and HTML pages that contain JSON-LD scripts to document as possible test cases
    HTML 0 0 0 0 Updated Mar 7, 2023
  • gleanerdocs.github.io Public

    Documentation repo
    0 2 0 2 Updated Feb 3, 2023
  • .github Public

    GleanerIO Profile
    0 0 0 0 Updated Jan 17, 2023
  • tangram Public

    Simple RESTful wrapper around pySHACL
    HTML 2 2 0 1 Updated Jul 18, 2022