Open Education Analytics (OEA) is a fully open-sourced (Creative Commons and MIT) data integration and analytics framework for the education sector built on Azure Synapse - with Azure Data Lake Storage as the storage backbone, Azure Active Directory providing role-based access control, and Azure Purview for data discovery and governance.
This repository contains a set of assets for setting up and walking through a reference implementation of the OEA reference architecture.
The underlying Azure platform services are mature and well documented, but this set of assets built on those services is very much a work in progress and comes with no warranties or SLAs. Each organization implementing these assets is responsible for adhering to its own data governance framework and ensuring the security and privacy of its data. General guidance for this is provided through the Training Resources listed below. This repo should be considered a starting point and accelerator for the development of your modern education data estate - once you have your starting point, it's a matter of iterating and optimizing as you evolve your design and build out the solution you need. We look forward to growing this set of assets in conjunction with you - our customers and partners.
To set up an environment with OpenEduAnalytics, you'll need:
- an Azure subscription (if you don't have an Azure subscription, you can set up a free subscription here, or check the current list of Azure offers)
- role assignment of "Owner" on the Azure subscription you're using
- make sure your preferred subscription is selected as default
az account list --query "[].{SubscriptionId:id,IsDefault:isDefault,Name:name,TenantId:tenantId}"
az account set --subscription <SubscriptionId>
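The "Owner" prerequisite above can be checked from the command line before running the setup. The following is a minimal sketch, not part of the OEA scripts: the helper names `subscription_scope` and `count_owner_assignments` are hypothetical, and it assumes the Azure CLI is installed, you are signed in, and your sign-in name as reported by `az account show` is accepted as an assignee.

```shell
# Sketch: count the Owner role assignments held by the signed-in user
# on the default subscription. Helper names are illustrative only.

# Build the resource scope string for a subscription ID.
subscription_scope() {
  printf '/subscriptions/%s\n' "$1"
}

# Print the number of matching Owner assignments (0 means none were found).
count_owner_assignments() {
  sub_id="$(az account show --query id -o tsv)" || return 1
  user="$(az account show --query user.name -o tsv)" || return 1
  az role assignment list \
    --assignee "$user" \
    --role Owner \
    --scope "$(subscription_scope "$sub_id")" \
    --include-inherited \
    --include-groups \
    --query "length(@)" -o tsv
}

# Only call the Azure CLI when it is actually available.
if command -v az >/dev/null 2>&1; then
  count_owner_assignments
else
  echo "Azure CLI (az) not found; try this in Cloud Shell" >&2
fi
```

A result of 1 or more suggests the role is in place; a result of 0 may also mean the sign-in name could not be matched, so treat it as a prompt to check the subscription's access control settings in the portal.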
You can set up this fully functional reference architecture (which includes test data sets for basic examples of usage) in 3 steps:
- Open cloud shell in your Azure subscription (use ctrl+click on the button below to open in a new page)
- Download this repo to your Azure clouddrive
cd clouddrive
git clone https://github.com/microsoft/OpenEduAnalytics
- Run the setup script like this (substitute "mysuffix" with your preferred suffix, which must be fewer than 13 characters and can contain only letters and numbers - it is appended to the names of the provisioned resources):
./OpenEduAnalytics/setup.sh mysuffix
(You can refer to this setup video for a quick walkthrough of this process)
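The suffix rule in step 3 (fewer than 13 characters, letters and numbers only) can be checked before invoking the script. This is a minimal sketch with a hypothetical helper, `is_valid_suffix`; it mirrors the rule as stated in this README and is not taken from setup.sh, which may apply further checks of its own.

```shell
# Sketch: validate a candidate suffix against the rule stated above.
# is_valid_suffix is an illustrative helper, not part of the OEA scripts.
is_valid_suffix() {
  suffix="$1"
  # Reject empty strings and anything of 13 or more characters.
  [ -n "$suffix" ] && [ "${#suffix}" -lt 13 ] || return 1
  # Reject any character that is not a letter or a digit.
  case "$suffix" in
    *[!A-Za-z0-9]*) return 1 ;;
  esac
  return 0
}

# Try a few candidates; the names are made up for illustration.
for candidate in mysuffix contoso2024 my-suffix averyveryverylongsuffix; do
  if is_valid_suffix "$candidate"; then
    echo "$candidate: ok"
  else
    echo "$candidate: rejected"
  fi
done
```

Checking up front avoids starting a provisioning run that would fail partway through on a bad resource name.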
By default, the setup script provisions Azure resources in the East US region, but you can choose other locations as well (e.g., westus, northeurope).
For example: ./OpenEduAnalytics/setup.sh mysuffix northeurope
For a list of available locations, you can use the command: az account list-locations
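To confirm that a location name is valid before passing it to the setup script, the list of names can be filtered rather than read by eye. The sketch below uses a hypothetical helper, `is_known_location`, which reads location names from standard input, one per line; the sample list is hardcoded here so the snippet runs without Azure access.

```shell
# Sketch: check whether a location name appears in a list of names on stdin.
# is_known_location is an illustrative helper, not part of the OEA scripts.
is_known_location() {
  # -x matches the whole line, -F treats the name as a fixed string.
  grep -qxF -- "$1"
}

# Demonstration with a small hardcoded sample of location names.
if printf '%s\n' eastus westus northeurope | is_known_location northeurope; then
  echo "northeurope: known"
fi

# Against the live list (assumes the Azure CLI is installed and signed in):
#   az account list-locations --query "[].name" -o tsv | is_known_location northeurope
```

The `--query "[].name" -o tsv` options reduce the CLI output to one short location name per line, which is the form the setup script examples above use.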
You can also choose to have the script create security groups to facilitate role-based access control to the data lake.
If you are running the setup for an environment in which you have Global Admin permissions on the tenant, and you want to have security groups provisioned, you can invoke the setup script like this:
./OpenEduAnalytics/setup.sh mysuffix eastus true
By default, the provisioned Azure resources are named according to recommended Azure naming standards; however, you can modify set_names.sh directly before running the setup if you want to specify an alternative set of resource names.
Provisioned Azure resources are tagged with oea_version. If an Azure policy requires specific tags to be assigned when a resource is created, these can be included via $OEA_ADDITIONAL_TAGS in set_names.sh using the format tagName=tagValue. Multiple tags must be separated by spaces.
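As an illustration of the tag format described above, the sketch below checks that a value intended for $OEA_ADDITIONAL_TAGS consists of space-separated tagName=tagValue entries. The helper `tags_are_well_formed` and the tag names are hypothetical, and how set_names.sh declares the variable is not shown here, so the assignment is only an example of the value's shape.

```shell
# Sketch: verify that every space-separated entry looks like tagName=tagValue.
# tags_are_well_formed is an illustrative helper, not part of the OEA scripts.
tags_are_well_formed() {
  # Unquoted on purpose so the string splits on whitespace.
  for tag in $1; do
    case "$tag" in
      ?*=?*) ;;   # non-empty name, '=', non-empty value
      *) echo "malformed tag: $tag" >&2; return 1 ;;
    esac
  done
  return 0
}

# Hypothetical tag names and values, in the format described above.
OEA_ADDITIONAL_TAGS="costCenter=12345 environment=dev"

if tags_are_well_formed "$OEA_ADDITIONAL_TAGS"; then
  echo "tags ok"
fi
```

Because entries are split on spaces, a tag value that itself contains a space would be read as two entries under this format.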
You can get more info by reviewing the documentation available in the docs folder. This includes a PowerPoint deck with notes and a PDF version that can be viewed in the browser:
- OEA_Overview.pptx (note: this won't display in the browser - you have to download the file and open it in PowerPoint)
- OEA_Overview.pdf
And thanks to our friends at Analytikus, who graciously offered to translate the deck, we have a Spanish version here: OEA_Overview_in_Spanish.pptx, OEA_Overview_in_Spanish.pdf
For more complete details on the installation and usage of the Open Edu Analytics base architecture and test environment, see the Open Edu Analytics Solution Guide.
For a practical intro to Azure Synapse Analytics, see Cloud Analytics with Microsoft Azure (a free e-book published in January 2021; also available in other formats for purchase).
The OEA architecture leverages low-cost data storage (Azure Data Lake Storage Gen2) as well as serverless data platform services that incur cost only when used. This means that the initial cost of implementing this architecture is very low, and cost grows only with usage.
We have a cost estimation worksheet that provides a simple model to calculate a cost estimate based on a small number of basic inputs. We will continue to validate this model against actual results seen by our customers and partners and refine it to be more accurate.
Resource | Description |
---|---|
OEA training videos | We have a set of OEA-specific training videos that provide an overview of OEA as well as walkthroughs of the installation and setup and of how to work within Synapse Studio and ML Studio. |
Azure Fundamentals part 1 | Azure fundamentals is a six-part series that teaches you basic cloud concepts, provides a streamlined overview of many Azure services, and guides you with hands-on exercises to deploy your very first services for free. |
Azure Fundamentals part 2 | Continuation of part 1 |
Azure for the Data Engineer | Explore how the world of data has evolved and how the advent of cloud technologies is providing new opportunities for businesses to explore. You will learn the various data platform technologies that are available, and how a Data Engineer can take advantage of this technology to an organization's benefit. |
Realize Integrated Analytical Solutions with Azure Synapse Analytics | Learn how Azure Synapse Analytics enables you to perform different types of analytics through its components, which can be used to build anything from Modern Data Warehouses to Advanced Analytical solutions. |
This project welcomes contributions and suggestions...
Microsoft and any contributors grant you a license to the Microsoft documentation and other content in this repository under the Creative Commons Attribution 4.0 International Public License, see the LICENSE file, and grant you a license to any code in the repository under the MIT License, see the LICENSE-CODE file.
Microsoft, Windows, Microsoft Azure and/or other Microsoft products and services referenced in the documentation may be either trademarks or registered trademarks of Microsoft in the United States and/or other countries. The licenses for this project do not grant you rights to use any Microsoft names, logos, or trademarks. Microsoft's general trademark guidelines can be found at http://go.microsoft.com/fwlink/?LinkID=254653.
Privacy information can be found at https://privacy.microsoft.com/en-us/
Microsoft and any contributors reserve all other rights, whether under their respective copyrights, patents, or trademarks, whether by implication, estoppel or otherwise.