BigSemantics is a free and open-source software architecture for developing powerful applications that derive and present explorable semantic information (metadata) from diverse information sources. Semantics summarize underlying information resources, for exploration within a unified context.
An example is the Metadata In-Context Expander (MICE) which helps the user maintain context while exploring linked semantic information. The MICE homepage shows semantic information for a CD on Amazon. Linked information, such as related bestseller lists or items people also buy, can be expanded in the same context. Some more examples:
The Getting Started section explains how you can use MICE in your own web application. BigSemantics supports a wide range of semantic types and web sites. In the next section, we will explain how you can easily extend BigSemantics to support new web sites that are useful to your application.
Supporting New Sites: Meta-Metadata Language
The real power of BigSemantics is that it provides an easy-to-learn, declarative language called meta-metadata, in which developers specify what semantic information they need and how it should be presented, for any web site that is useful to their application. No special HTML tags are needed from the source web site.
These specifications are written in code blocks called wrappers, each corresponding to a type of entity (such as product) and a source of information (such as Amazon). Common semantic types can be conveniently reused through inheritance, reducing the effort needed to develop new wrappers. BigSemantics comes with a repository of wrappers, organized in a hierarchy. Learn more about the meta-metadata language in the Getting Started section.
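The reuse-through-inheritance idea can be pictured with ordinary class inheritance. The Java sketch below is only an analogy, not meta-metadata syntax and not code shipped with BigSemantics: the type names `Document`, `Product`, and `AmazonProduct` and all field names are hypothetical, chosen to show how a source-specific type could add a few fields on top of what it inherits.

```java
import java.util.ArrayList;
import java.util.List;

public class WrapperInheritanceSketch {
    // Hypothetical base type: fields shared by every kind of document.
    static class Document {
        String title;
        String location; // URL the information came from

        List<String> fieldNames() {
            List<String> names = new ArrayList<>();
            names.add("title");
            names.add("location");
            return names;
        }
    }

    // Hypothetical entity type: inherits the base fields and adds one.
    static class Product extends Document {
        String price;

        @Override
        List<String> fieldNames() {
            List<String> names = super.fieldNames();
            names.add("price");
            return names;
        }
    }

    // Hypothetical source-specific type: only declares what is new for this source.
    static class AmazonProduct extends Product {
        String overallRating;

        @Override
        List<String> fieldNames() {
            List<String> names = super.fieldNames();
            names.add("overallRating");
            return names;
        }
    }

    public static void main(String[] args) {
        Document doc = new AmazonProduct();
        // prints [title, location, price, overallRating]
        System.out.println(doc.fieldNames());
    }
}
```

In this sketch, supporting another product source would mean adding one more small subclass of `Product` rather than redefining the shared fields, which is the kind of saving the wrapper hierarchy is meant to provide.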
This section helps you get started on using BigSemantics in your own application.
Checking Out Code
All BigSemantics projects are hosted inside an umbrella project, BigSemantics, in the form of git submodules. Follow this step-by-step guide to check out the code and set up a development environment for using and developing with BigSemantics. The guide also explains how you can share your work on wrappers or other parts of BigSemantics with the community through pull requests.
Using BigSemantics in Web Applications
Meta-Metadata: Language for Authoring Wrappers
With BigSemantics, developers and other curators author application-independent, reusable code blocks called wrappers, in the meta-metadata language, to specify data models, extraction rules, and presentation semantics of metadata. Data models, extraction rules, and presentation semantics are integrated together in wrappers to constitute metadata types. BigSemantics comes with a repository of wrappers supporting a wide range of types and sources.
The meta-metadata language supports representing nested, cross-linked, and recursive data models. For example, metadata for books can contain metadata for their authors, while metadata for authors can again contain metadata for their books.
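As a rough illustration of such a cross-linked, recursive model, the Java sketch below links books and authors in both directions. The `Book` and `Author` classes, their fields, and the `link` helper are hypothetical stand-ins written for this example; they are not the metadata classes that BigSemantics generates.

```java
import java.util.ArrayList;
import java.util.List;

public class RecursiveModelSketch {
    // Hypothetical author metadata: holds references to book metadata.
    static class Author {
        final String name;
        final List<Book> books = new ArrayList<>();

        Author(String name) {
            this.name = name;
        }
    }

    // Hypothetical book metadata: holds references back to author metadata.
    static class Book {
        final String title;
        final List<Author> authors = new ArrayList<>();

        Book(String title) {
            this.title = title;
        }
    }

    // Records the relationship on both sides so it can be followed either way.
    static void link(Book book, Author author) {
        book.authors.add(author);
        author.books.add(book);
    }

    public static void main(String[] args) {
        Author author = new Author("Example Author");
        Book first = new Book("First Book");
        Book second = new Book("Second Book");
        link(first, author);
        link(second, author);

        // Start at one book, step to its author, then list that author's books.
        for (Book b : first.authors.get(0).books) {
            System.out.println(b.title);
        }
    }
}
```

Because the references form a cycle, any code that walks such a structure (for printing or serialization, say) needs to track visited objects or limit depth to avoid looping forever.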
At runtime, BigSemantics (in the form of a library or a web service) extracts metadata from web pages using wrappers. To make semantics conveniently accessible from client programs (e.g. applications written in Java or C#), BigSemantics automatically generates native classes, called metadata classes, from the data models defined in wrappers. Extracted metadata is mapped to native Java or C# instances of these metadata classes for use in programs.
Learn how to use the meta-metadata language to author wrappers for any web site. The tutorial walks through data model definitions, information extraction, and metadata presentation in detail, and covers useful tools to assist development. Developers with object-oriented programming experience will find it easy to understand and use.
We host a BigSemantics service instance that can be accessed freely by the public (without guarantee of availability). You can host your own instance by following the guide.
Developing with BigSemantics
The following sources offer tutorials on developing with BigSemantics in other languages or platforms:
- Building Java Applications with BigSemantics
- Building Android Applications with BigSemantics
- Building C# Applications with BigSemantics
You are welcome to contribute to BigSemantics! The software architecture is illustrated on this page. For further details on how to extend the core of BigSemantics, including processing wrappers and types, extracting semantic information, and maintaining a graph of linked documents, see this tutorial.
Publications and Useful Links
Check out the research papers and background for students. You can view [demos](http://ecologylab.github.io/BigSemantics/) made with BigSemantics. A useful page for viewing and comparing metadata for a website is the IdeaMACHE Dropzone.
For general support, please contact us at firstname.lastname@example.org.