Easily discover government data
t83714 Merge pull request #1924 from magda-io/issue/1923-wrong-postgres-password-secret-key

Realigned the migrator postgres password with secret generator
Latest commit b505f73 Dec 14, 2018
Type Name Latest commit message Commit time
.circleci Fixed up ui kit being incompatible with itself. Jun 21, 2018
.github Merge remote-tracking branch 'origin/master' into issue/1923-wrong-po… Dec 14, 2018
.vscode Rename sleuthers to minions (Fixes #866) Sep 5, 2018
Logo Added pngs and jpgs of logo Jun 21, 2018
deploy Merge remote-tracking branch 'origin/master' into issue/1923-wrong-po… Dec 14, 2018
docs Remove web-admin Nov 22, 2018
magda-admin-api Add readiness and liveness probes to all services Dec 4, 2018
magda-apidocs-server Add readiness and liveness probes to all services Dec 4, 2018
magda-authorization-api Add readiness and liveness probes to all services Dec 4, 2018
magda-builder-docker Aligning versions. Nov 28, 2018
magda-builder-nodejs Aligning versions. Nov 28, 2018
magda-builder-scala Bumped to v0.0.51-0 Nov 26, 2018
magda-ckan-connector Bumped to v0.0.51-0 Nov 26, 2018
magda-content-api Add readiness and liveness probes to all services Dec 4, 2018
magda-correspondence-api Add readiness and liveness probes to all services Dec 4, 2018
magda-csv-connector Bumped to v0.0.51-0 Nov 26, 2018
magda-csw-connector Bumped to v0.0.51-0 Nov 26, 2018
magda-dap-connector Merge remote-tracking branch 'origin/master' into issue/983-alt Dec 5, 2018
magda-db-migrator Aligning versions. Nov 28, 2018
magda-elastic-search - Added missing line Dec 6, 2018
magda-gateway Add readiness and liveness probes to all services Dec 4, 2018
magda-indexer Fixed: indexer throws an error when processes spatial data number wit… Dec 7, 2018
magda-int-test Merge branch 'master' into issue/983-alt Dec 12, 2018
magda-migrator-authorization-db Aligning versions. Nov 28, 2018
magda-migrator-content-db Bumped to v0.0.51-0 Nov 26, 2018
magda-migrator-registry-db Aligning versions. Nov 28, 2018
magda-migrator-session-db Aligning versions. Nov 28, 2018
magda-minion-broken-link Bumped to v0.0.51-0 Nov 26, 2018
magda-minion-format Bumped to v0.0.51-0 Nov 26, 2018
magda-minion-framework Bumped to v0.0.51-0 Nov 26, 2018
magda-minion-linked-data-rating Aligning versions. Nov 28, 2018
magda-minion-visualization Bumped to v0.0.51-0 Nov 26, 2018
magda-postgres Aligning versions. Nov 28, 2018
magda-preview-map Aligning versions. Nov 28, 2018
magda-project-open-data-connector Bumped to v0.0.51-0 Nov 26, 2018
magda-registry-api Add readiness and liveness probes to all services Dec 4, 2018
magda-registry-aspects Bumped to v0.0.51-0 Nov 26, 2018
magda-scala-common Merge branch 'master' into issue/983-alt Dec 12, 2018
magda-scss-compiler Bumped to v0.0.51-0 Nov 26, 2018
magda-search-api Merge remote-tracking branch 'origin/master' into issue/983-alt Dec 11, 2018
magda-typescript-common Changes based on PR feedback Dec 5, 2018
magda-web-client Merge pull request #1929 from magda-io/issue/1915 Dec 13, 2018
magda-web-server Bumped to v0.0.51-0 Nov 26, 2018
project Modified everything to make it work with gitlab ci preview app builds. Apr 23, 2018
scripts Merge branch 'master' into add-config-to-build-create-secrets-script Dec 12, 2018
.editorconfig Mandated double quotes in javascript. Oct 5, 2017
.eslintignore Got eslint under control. Sep 3, 2018
.gitattributes Not reporting as perl Sep 7, 2018
.gitignore Add readiness and liveness probes to all services Dec 4, 2018
.gitlab-ci.yml Made docker publish run on same tags as helm publish. Nov 12, 2018
.prettierignore Got eslint under control. Sep 3, 2018
.project a Jan 8, 2017
.travis.yml Rename sleuthers to minions (Fixes #866) Sep 5, 2018
CHANGES.md Merge remote-tracking branch 'origin/master' into issue/1923-wrong-po… Dec 14, 2018
CONTRIBUTORS.md Updated CHANGES.md and CONTRIBUTORS.md. Oct 24, 2018
LICENSE Made it stop calling CKAN when being developed locally. Sep 16, 2016
README.md Update README.md Sep 5, 2018
build.sbt Modified everything to make it work with gitlab ci preview app builds. Apr 23, 2018
lerna.json Aligning versions. Nov 28, 2018
magda.code-workspace Rename sleuthers to minions (Fixes #866) Sep 5, 2018
package.json Removed superfluous chart versions. Aug 20, 2018
prettier.config.js Fixed wrong versions. Jun 7, 2018
tsconfig-global.json Prettified everything. Feb 22, 2018
yarn.lock Add readiness and liveness probes to all services Dec 4, 2018

README.md

Magda

Try it out at search.data.gov.au. Join the chat at https://gitter.im/magda-data/Lobby

Magda is a modern platform built to power a new generation of data portals. Its goal is to improve on existing data portal and management solutions in a number of areas:

  • Discoverability of high-quality and relevant data (particularly through search)
  • Automatic derivation, repair and/or enhancement of data and metadata
  • Seamless federation across multiple data sources
  • Collaboration between data providers and users, as well as between users themselves
  • Quick and effective previewing of datasets, so that the user never has to download a dataset only to find it's not useful
  • An ecosystem that allows extension in any programming language
  • An easy installation and setup process

Magda is a solution for any problem that involves a collection or collections of datasets that need to be searched over, discussed and/or viewed in a single place. Whatever format the data is in, however well-formed the metadata is, and wherever and in however many places the data is stored, Magda can either work with it or be extended to do so.

The project was started by CSIRO Data61 and Australia's Department of the Prime Minister and Cabinet as the future of data.gov.au, and is currently in alpha at search.data.gov.au. As a result, it's ideal for powering open data portals, particularly those that involve federating over a number of other more focused portals - for example, data.gov.au is a federal government portal that publishes its own data and makes it available alongside data from department and state portals. However, it can just as easily be run on an organisational intranet as a central private data portal - and can even be set up to include relevant open data in search results alongside private data without exposing any private data to the internet.

Current Status

Magda is under active development. It now has a reasonably stable, documented API, and it runs stably in production at https://search.data.gov.au. The features developed so far mainly center around its use as an open data search engine - we're now developing features that allow it to host its own data and be usable for private data too.

Future

Magda has been developed as a search tool for open data, but our ambition is to bring it inside government agencies as well, so that they can have the same quality of tools for their own private data as they do for open data. We hope to make improvements in a number of areas:

  • An opinionated, highly guided publishing process intended to produce high-quality metadata, rather than simply encourage publishing with any quality of metadata
  • A robust mechanism for authorization that allows for tight controls over who can see what datasets
  • An easy-to-use administration interface, so that the product can be run without needing to use the command line
  • Workflows to facilitate data sharing and the opening of data, within the software itself

Our current roadmap is available at https://magda.io/docs/roadmap

Architecture

Magda is built around a collection of microservices that are distributed as docker containers. This was done to provide easy extensibility - Magda can be customised by simply adding new services using any technology as docker images, and integrating them with the rest of the system via stable HTTP APIs. Using Kubernetes for orchestration means that configuration of a customised Magda instance can be stored and tracked as plain text, and instances with identical configuration can be quickly and easily reproduced.

Magda Architecture Diagram

Registry

Magda revolves around the Registry - an unopinionated datastore built on top of Postgres. The Registry stores records as a set of JSON documents called aspects. For instance, a dataset is represented as a record with a number of aspects - a basic one that records the name, description and so on as well as more esoteric ones that might not be present for every dataset, like temporal coverage or determined data quality. Likewise, distributions (the actual data files, or URLs linking to them) are also modelled as records, with their own sets of aspects covering both basic metadata once again, as well as more specific aspects like whether the URL to the file worked when last tested.
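
As a rough illustration of the record-and-aspect model described above, the sketch below models a dataset and a distribution as plain records that each carry a map of aspects. The type names, aspect ids (`basic-dataset`, `temporal-coverage`, `link-status` and so on) and field names are assumptions made for this example, not Magda's actual schema.

```typescript
// Hypothetical shapes for illustration; not Magda's actual type definitions.
type Aspect = { [field: string]: unknown };

interface RegistryRecord {
    id: string;
    name: string;
    // Aspect id -> JSON document holding that aspect's data.
    aspects: { [aspectId: string]: Aspect };
}

// A distribution is just another record with its own aspects.
const csvFile: RegistryRecord = {
    id: "dist-1",
    name: "Rainfall 2018 (CSV)",
    aspects: {
        "basic-distribution": { downloadURL: "https://example.com/rainfall.csv", format: "CSV" },
        "link-status": { status: "active" }
    }
};

// A dataset carries a basic aspect plus optional, more specialised ones.
const rainfall: RegistryRecord = {
    id: "ds-1",
    name: "Rainfall observations",
    aspects: {
        "basic-dataset": { title: "Rainfall observations", description: "Daily totals" },
        "temporal-coverage": { start: "2018-01-01", end: "2018-12-31" },
        "dataset-distributions": { distributions: [csvFile.id] }
    }
};

// Returns the named aspect, or undefined when the record does not carry it.
function getAspect(record: RegistryRecord, aspectId: string): Aspect | undefined {
    return Object.prototype.hasOwnProperty.call(record.aspects, aspectId)
        ? record.aspects[aspectId]
        : undefined;
}
```

Because optional aspects are simply absent keys, a record that lacks temporal coverage needs no placeholder value; callers check for `undefined` instead.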

Most importantly, aspects can be declared dynamically by other services by simply making a call with a name, description and JSON schema. This means that if you need to store extra information about a dataset or distribution, you can easily do so by declaring your own aspect. Because the system isn't opinionated about what a record is beyond a set of aspects, you can also use this to add new entities to the system that link together - for instance, we've used this to store projects with a name and description that link to a number of datasets.
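
The idea of declaring an aspect and then using it to model a new kind of entity can be sketched with a small in-memory stand-in for the Registry. Everything here is hypothetical: the class, its method names and the `project` aspect are invented for illustration, the real Registry is reached over HTTP, and only the `required` keyword of JSON Schema is checked.

```typescript
// Hypothetical in-memory stand-in for the Registry, for illustration only.
type AspectData = { [field: string]: unknown };

interface AspectDefinition {
    id: string;
    name: string;
    description: string;
    // Only the "required" keyword of JSON Schema is checked in this sketch.
    jsonSchema: { required?: string[] };
}

interface StoredRecord {
    id: string;
    name: string;
    aspects: { [aspectId: string]: AspectData };
}

class InMemoryRegistry {
    private definitions: { [aspectId: string]: AspectDefinition } = Object.create(null);
    private records: { [recordId: string]: StoredRecord } = Object.create(null);

    // Any service may declare a new aspect at runtime.
    declareAspect(definition: AspectDefinition): void {
        this.definitions[definition.id] = definition;
    }

    putRecord(id: string, name: string): void {
        this.records[id] = { id: id, name: name, aspects: {} };
    }

    // Attaches aspect data to a record, rejecting undeclared aspects
    // and data that lacks a field the schema marks as required.
    putAspect(recordId: string, aspectId: string, data: AspectData): void {
        const definition = this.definitions[aspectId];
        const record = this.records[recordId];
        if (!definition) throw new Error("Unknown aspect: " + aspectId);
        if (!record) throw new Error("Unknown record: " + recordId);
        const required = definition.jsonSchema.required || [];
        for (let i = 0; i < required.length; i++) {
            if (!(required[i] in data)) {
                throw new Error("Missing required field: " + required[i]);
            }
        }
        record.aspects[aspectId] = data;
    }

    getRecord(recordId: string): StoredRecord | undefined {
        return this.records[recordId];
    }
}

// A "project" is not a built-in concept: it is a record carrying a project aspect
// whose data links to dataset records by id.
const registry = new InMemoryRegistry();
registry.declareAspect({
    id: "project",
    name: "Project",
    description: "A project that groups a number of datasets",
    jsonSchema: { required: ["description", "datasets"] }
});
registry.putRecord("ds-1", "Rainfall observations");
registry.putRecord("proj-1", "Climate study");
registry.putAspect("proj-1", "project", { description: "Links climate datasets", datasets: ["ds-1"] });
```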

Connectors

Connectors go out to external datasources and copy their metadata into the Registry, so that they can be searched and have other aspects attached to them. A connector is simply a docker-based microservice that is invoked as a job. It scans the target datasource (usually an open-data portal), then completes and shuts down. We have connectors for a number of existing open data formats; otherwise, you can easily write and run your own.
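
The scan-then-exit life cycle of a connector can be sketched as a single function that reads every dataset from a source, maps each one to a record and hands it to a writer. The interfaces, aspect ids and the id-prefixing scheme below are assumptions for this example rather than the interface of Magda's connector packages, and the source is read synchronously only to keep the sketch short.

```typescript
// All names here are hypothetical; this is not the interface of Magda's connectors.
interface SourceDataset {
    identifier: string;
    title: string;
    notes: string;
}

interface ConnectedRecord {
    id: string;
    name: string;
    aspects: { [aspectId: string]: { [field: string]: unknown } };
}

interface ExternalSource {
    id: string;
    // A real connector would page through a remote API asynchronously.
    listDatasets(): SourceDataset[];
}

// Maps one source dataset to a registry record, prefixing the id with the source id
// so that records harvested from different portals cannot collide.
function toRecord(sourceId: string, dataset: SourceDataset): ConnectedRecord {
    return {
        id: sourceId + "-" + dataset.identifier,
        name: dataset.title,
        aspects: {
            "basic-dataset": { title: dataset.title, description: dataset.notes },
            "source": { sourceId: sourceId, originalId: dataset.identifier }
        }
    };
}

// Runs once: scan the source, write every record, report the count, and finish.
function runConnector(
    source: ExternalSource,
    writeRecord: (record: ConnectedRecord) => void
): number {
    const datasets = source.listDatasets();
    datasets.forEach(function (dataset) {
        writeRecord(toRecord(source.id, dataset));
    });
    return datasets.length;
}

// Usage with an in-memory source and an array standing in for the Registry.
const portal: ExternalSource = {
    id: "example-portal",
    listDatasets: function () {
        return [
            { identifier: "a1", title: "Rainfall", notes: "Daily totals" },
            { identifier: "b2", title: "Bus stops", notes: "Stop locations" }
        ];
    }
};
const harvested: ConnectedRecord[] = [];
const harvestedCount = runConnector(portal, function (record) {
    harvested.push(record);
});
```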

Minions

A minion is a service that listens for new records or changes to existing records, performs some kind of operation and then writes the result back to the registry. For instance, we have a broken link minion that listens for changes to distributions, retrieves the URLs described, records whether they were able to be accessed successfully and then writes that back to the registry in its own aspect.
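
A minimal sketch of the broken link example, assuming a hypothetical `link-status` aspect and a `downloadURL` field on a `basic-distribution` aspect. The reachability check and the registry write are passed in as functions so that the example stays offline; a real minion would make an HTTP request and call the Registry API instead.

```typescript
// Hypothetical names throughout; this is not Magda's minion framework API.
type AspectFields = { [field: string]: unknown };

interface DistributionRecord {
    id: string;
    aspects: { [aspectId: string]: AspectFields };
}

// Aspect id owned by this minion (an assumption for this example).
const LINK_STATUS_ASPECT = "link-status";

// Called whenever a distribution is created or changed.
function onDistributionChanged(
    record: DistributionRecord,
    isReachable: (url: string) => boolean,
    writeAspect: (recordId: string, aspectId: string, data: AspectFields) => void
): void {
    const basic = record.aspects["basic-distribution"];
    const url = basic ? basic.downloadURL : undefined;
    if (typeof url !== "string") {
        return; // No URL on this distribution, so there is nothing to check.
    }
    // The result goes into the minion's own aspect, leaving other aspects untouched.
    writeAspect(record.id, LINK_STATUS_ASPECT, {
        status: isReachable(url) ? "active" : "broken"
    });
}

// Usage with a fake checker that treats only https URLs as reachable.
const writtenAspects: { [key: string]: AspectFields } = {};
const fakeReachable = function (url: string): boolean {
    return url.indexOf("https://") === 0;
};
const recordWrite = function (recordId: string, aspectId: string, data: AspectFields): void {
    writtenAspects[recordId + "/" + aspectId] = data;
};

onDistributionChanged(
    { id: "dist-1", aspects: { "basic-distribution": { downloadURL: "https://example.com/a.csv" } } },
    fakeReachable,
    recordWrite
);
onDistributionChanged(
    { id: "dist-2", aspects: { "basic-distribution": { downloadURL: "ftp://example.com/b.csv" } } },
    fakeReachable,
    recordWrite
);
```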

Other aspects exist that are written to by many minions - for instance, we have a "quality" aspect that contains a number of different quality ratings from different sources, which are averaged out and used by search.
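
The shared quality aspect can be pictured as a map from rating source to score. The sketch below assumes each minion writes a score between 0 and 1 under its own key and that the overall figure is a plain arithmetic mean; the text only says the ratings are "averaged out", so the unweighted mean and the key names are assumptions.

```typescript
// Hypothetical shape: each minion writes its rating under its own key.
type QualityAspect = { [ratingSource: string]: { score: number } };

// Unweighted arithmetic mean of all ratings; returns 0 when there are none.
function averageQuality(quality: QualityAspect): number {
    const sources = Object.keys(quality);
    if (sources.length === 0) {
        return 0;
    }
    let total = 0;
    for (let i = 0; i < sources.length; i++) {
        total += quality[sources[i]].score;
    }
    return total / sources.length;
}

const datasetQuality: QualityAspect = {
    "linked-data-rating": { score: 0.5 },
    "link-status": { score: 1 }
};
const overallQuality = averageQuality(datasetQuality); // (0.5 + 1) / 2 = 0.75
```

Keying the ratings by source lets each minion overwrite only its own entry without coordinating with the others.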

Search

Datasets and distributions in the registry are ingested into an Elasticsearch cluster, which indexes a few core aspects of each and exposes an API.
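
To illustrate what indexing "a few core aspects" might mean, the sketch below flattens a record into a small search document and runs a naive substring match over an array of such documents. It is a stand-in only: the aspect id and field names are assumptions, and the real index performs text analysis and relevance ranking that this example does not attempt.

```typescript
// Hypothetical shapes; the real index mapping is not shown in this document.
type Fields = { [field: string]: unknown };

interface IndexableRecord {
    id: string;
    aspects: { [aspectId: string]: Fields };
}

interface SearchDocument {
    id: string;
    title: string;
    description: string;
}

function asText(value: unknown): string {
    return typeof value === "string" ? value : "";
}

// Copies only the searchable core fields; every other aspect stays in the registry.
function toSearchDocument(record: IndexableRecord): SearchDocument {
    const basic: Fields = record.aspects["basic-dataset"] || {};
    return {
        id: record.id,
        title: asText(basic.title),
        description: asText(basic.description)
    };
}

// Naive case-insensitive substring match standing in for a real search query.
function searchDocuments(index: SearchDocument[], query: string): SearchDocument[] {
    const needle = query.toLowerCase();
    return index.filter(function (doc) {
        return (doc.title + " " + doc.description).toLowerCase().indexOf(needle) !== -1;
    });
}

const registryRecords: IndexableRecord[] = [
    {
        id: "ds-1",
        aspects: {
            "basic-dataset": { title: "Rainfall observations", description: "Daily totals" },
            "temporal-coverage": { start: "2018-01-01" }
        }
    },
    { id: "ds-2", aspects: { "basic-dataset": { title: "Bus stops", description: "Stop locations" } } }
];
const searchIndex = registryRecords.map(toSearchDocument);
const rainHits = searchDocuments(searchIndex, "rain");
```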

User Interface

Magda provides a user interface, which is served from its own microservice and consumes the APIs. We're planning to make the UI itself extensible with plugins at some point in the future.

To try the latest version (with prebuilt images)

Use https://github.com/magda-io/magda-config

To build and run from source

https://magda.io/doc/building-and-running

To get help with developing or running Magda

Talk to us on Gitter! Join the chat at https://gitter.im/magda-data/Lobby

Want to talk about deploying this into your agency?

Email us at contact@magda.io.

Want to contribute?

Great! Take a look at https://github.com/TerriaJS/magda/blob/master/.github/CONTRIBUTING.md :).