The Research Analytics Suite (RAS), developed by Lane within Gire Lab at the University of Washington, is a comprehensive, open-source platform written in Python for aggregating and analyzing scientific data from diverse sources. RAS is designed to be free and accessible, addressing financial and accessibility barriers in scientific research.
Please note: RAS is currently under active development and is not yet ready for public use. This repository is intended for demonstration purposes and to showcase the project's structure and features while it is being developed.
Key Features:
- Data Management Engine (DME): Filters and aggregates large, complex datasets from multiple sources.
- Analytics Suite: Includes tools for research data analysis, advanced statistics, machine learning algorithms, and data visualization.
- Preloaded Functions: Ready-to-use functions for common analysis tasks.
- Custom Functions: Allows users to create and implement custom analysis functions.
- Future Integration: Designed for compatibility with tools like DeepLabCut.
RAS aims to foster a collaborative research community, enabling scientists and researchers to share their analytic workflows and contribute to a repository of shared knowledge, accelerating scientific discovery and innovation.
The Research Analytics Suite (RAS) is an open-source platform developed in Python to address the diverse needs of scientific data analysis. RAS offers a comprehensive suite of tools for aggregating, managing, and analyzing data from a variety of input sources, such as pixel-tracking technology, accelerometers, and analog voltage outputs.
RAS aims to democratize access to powerful data analysis tools traditionally dominated by commercial software. By eliminating financial barriers, RAS empowers researchers, educators, and industry professionals to conduct sophisticated analyses without the associated costs.
- The Research Analytics Suite aspires to cultivate a collaborative research community. It envisions a platform where scientists and researchers can share their analytic workflows, collaborate on projects, and contribute to a growing repository of shared knowledge and resources. This collaborative spirit aims to accelerate scientific discovery and innovation by leveraging the collective expertise of the global research community.
- By providing a versatile and accessible toolset, RAS not only enhances the efficiency and effectiveness of data analysis but also fosters a culture of open collaboration and shared progress in the scientific community.
- Data Management Engine (DME): A robust system for filtering and aggregating large, complex datasets from multiple sources. The DME ensures seamless integration and handling of diverse data types, facilitating comprehensive and efficient data analysis.
- Analytics Suite: Offers an extensive array of tools for research data analysis, including advanced statistical methods, machine learning algorithms, and data visualization techniques. The analytics suite is designed to be both powerful and flexible, catering to the specific needs of each user.
- Preloaded Functions: A library of ready-to-use functions for common analysis tasks, enabling users to quickly apply standard methods without extensive setup.
- Custom / User-Defined Functions: Allows users to create and implement their own analysis functions, fostering innovation and customization in research workflows.
- Future Integration: RAS is designed with future compatibility in mind, aiming to integrate seamlessly with other leading tools in the field, such as DeepLabCut, to expand its capabilities further.
RAS is built using a variety of powerful tools and libraries to ensure robust functionality and performance, including (but not limited to):
Package | Description |
---|---|
Python | The core programming language used for the development of RAS. |
Dask | A flexible parallel computing library for analytic computing. |
PyTorch | An open-source machine learning framework for deep learning. |
TensorFlow | An end-to-end open-source platform for machine learning. |
Distributed | A library for distributed computing with Python. |
DearPyGui | An easy-to-use, high-performance GUI framework for Python. |
DearPyGui-Async | An extension for DearPyGui to support asynchronous operations. |
Cachey | A caching library for managing the lifecycle of cached objects. |
Matplotlib | A comprehensive library for creating static, animated, and interactive visualizations in Python. |
NumPy | The fundamental package for scientific computing with Python. |
To get a local copy up and running, follow these steps.
An executable package of RAS will be available for each platform once the project is ready for prototyping and deployment. In the meantime...
- Ensure you have Python 3.8 or later installed.
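One quick way to confirm the prerequisite is to ask the interpreter itself. The snippet below is a minimal sketch, not part of RAS; the helper name `python_is_supported` is made up for illustration.

```python
import sys

# Minimum version stated in the prerequisite above.
MIN_PYTHON = (3, 8)


def python_is_supported(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True when the (major, minor) version meets the minimum."""
    return tuple(version_info[:2]) >= minimum


if python_is_supported():
    print("Python version OK")
else:
    found = ".".join(str(part) for part in sys.version_info[:3])
    print(f"Python {MIN_PYTHON[0]}.{MIN_PYTHON[1]}+ is required, found {found}")
```

Run it with the same `python` executable you plan to use for RAS, since several interpreters can coexist on one machine.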
- Clone the repo:
  git clone https://github.com/lane-neuro/research-analytics-suite.git
- Navigate to the project directory:
  cd research-analytics-suite
- Install the required packages. This can be done in one of two ways:
  - If you use Anaconda to manage virtual environments, you can use the supplied environment.yml file to create a new environment with all the required packages. To do this, run the following command in the terminal:
    conda env create -f environment.yml
    Then, activate the environment:
    conda activate research-analytics-suite
  - Alternatively, you can install the required packages globally. While this is the easier route, it is typically not recommended, because it installs all requirements on your global Python path. If this is what you wish to do, run the following command in the terminal:
    pip install -r requirements.txt
- Run the project using the following command in the terminal (see Command Line Arguments below for customization):
  python ResearchAnalyticsSuite.py
You can provide the following command line arguments to customize the behavior of the Research Analytics Suite:
- -g, --gui: Launches the Research Analytics Suite GUI.
  - default: 'true'
- -o, --open_workspace: Opens or creates a workspace at the specified directory.
  - default: '~/Research-Analytics-Suite/workspaces/default_workspace'
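To make the two options concrete, here is a minimal `argparse` sketch that accepts the same flags and defaults listed above. It is an illustration only and is not taken from the RAS source; `build_parser` and `str_to_bool` are hypothetical names, and the real entry point may parse its arguments differently.

```python
import argparse

DEFAULT_WORKSPACE = "~/Research-Analytics-Suite/workspaces/default_workspace"


def str_to_bool(value):
    """Convert the strings 'true' / 'false' (any case) to a boolean."""
    normalized = str(value).strip().lower()
    if normalized == "true":
        return True
    if normalized == "false":
        return False
    raise argparse.ArgumentTypeError(f"expected 'true' or 'false', got {value!r}")


def build_parser():
    """Build a parser mirroring the documented -g and -o options."""
    parser = argparse.ArgumentParser(prog="ResearchAnalyticsSuite.py")
    parser.add_argument(
        "-g", "--gui",
        type=str_to_bool,
        default="true",  # string defaults are passed through `type` by argparse
        help="launch the GUI ('true') or run in command-line mode ('false')",
    )
    parser.add_argument(
        "-o", "--open_workspace",
        default=DEFAULT_WORKSPACE,
        help="directory of the workspace to open or create",
    )
    return parser


# With no arguments, the documented defaults apply.
defaults = build_parser().parse_args([])
print(defaults.gui, defaults.open_workspace)
```

Taking an explicit `true` or `false` value, rather than using a bare switch, matches the `-g true` and `-g false` invocations shown in this README.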
You can specify the workspace to open or create using the -o or --open_workspace argument.
python ResearchAnalyticsSuite.py -o ~/Research-Analytics-Suite/workspaces/a_new_workspace
- This will create a new workspace at ~/Research-Analytics-Suite/workspaces/a_new_workspace if it does not already exist.
- If it does exist, it will open the existing workspace at that location.
- Note: The path must be to a directory, not a file.
  - So if you want to open a 'config.json' file located at ~/Research-Analytics-Suite/workspaces/look_another_workspace/config.json, you would instead use the following command:
    python ResearchAnalyticsSuite.py -o ~/Research-Analytics-Suite/workspaces/look_another_workspace
  - This will open the workspace configuration file located within the ~/Research-Analytics-Suite/workspaces/look_another_workspace directory.
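The directory-versus-file rule can be sketched as a small path check. The helper below, `resolve_workspace_dir`, is hypothetical and only illustrates the behavior described above (create the directory if missing, reuse it if present, reject a file path); how RAS actually validates the argument is not documented here.

```python
import tempfile
from pathlib import Path


def resolve_workspace_dir(raw_path):
    """Expand '~' and return the workspace directory, creating it when absent."""
    path = Path(raw_path).expanduser()
    if path.is_file():
        # A file such as config.json was passed; its parent is the workspace.
        raise ValueError(
            f"{path} is a file; pass its parent directory instead: {path.parent}"
        )
    path.mkdir(parents=True, exist_ok=True)
    return path


# Demonstration inside a throwaway directory so nothing is left behind.
error_message = None
with tempfile.TemporaryDirectory() as tmp:
    workspace = resolve_workspace_dir(Path(tmp) / "a_new_workspace")
    created = workspace.is_dir()

    config_file = workspace / "config.json"
    config_file.write_text("{}", encoding="utf-8")
    try:
        resolve_workspace_dir(config_file)
    except ValueError as err:
        error_message = str(err)
        print(error_message)
```

Raising an error instead of silently switching to the parent directory keeps the caller aware that the argument was wrong, which is one reasonable design among several.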
The GUI is the default mode of operation for RAS. However, you can explicitly specify the GUI mode by using the following command:
python ResearchAnalyticsSuite.py -g true
- This will launch RAS with the GUI; this essentially has the same effect as running RAS without any command-line arguments.
- This is the default mode of operation for RAS, providing an interactive interface for data analysis and visualization.
The GUI is the default mode of operation for RAS. However, you can run RAS in command-line mode only by using the following command:
python ResearchAnalyticsSuite.py -g false
- This will launch RAS without the GUI, running in command-line mode only.
- This is useful for running RAS in headless mode or for scripting purposes, such as batch processing, automation, or integration with other tools.
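For batch processing, the documented flags can be assembled from a script. The following is a rough sketch: `build_ras_command` and `run_batch` are hypothetical helpers, and it assumes that a headless RAS run exits on its own, which this README does not state.

```python
import subprocess
import sys
from pathlib import Path


def build_ras_command(workspace, gui=False, script="ResearchAnalyticsSuite.py"):
    """Build the argument list for one RAS launch using the documented flags."""
    return [
        sys.executable,
        script,
        "-g", "true" if gui else "false",
        "-o", str(Path(workspace).expanduser()),
    ]


def run_batch(workspaces, runner=subprocess.run):
    """Launch RAS once per workspace, in order, and collect the return codes."""
    return_codes = []
    for workspace in workspaces:
        completed = runner(build_ras_command(workspace), check=False)
        return_codes.append(completed.returncode)
    return return_codes


# Inspect the command without launching anything.
print(build_ras_command("workspaces/session_01"))
```

Passing the command as a list avoids shell quoting problems with workspace paths that contain spaces. The `runner` parameter exists so the loop can be exercised without starting RAS.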
The project structure is fluid and subject to change as the project is developed further. Refer to the code for the most up-to-date information.
The operation_manager package orchestrates and manages data processing operations within RAS.
- chains: Handles sequences of operations.
- control: Manages control mechanisms.
- execution: Handles operation execution.
- lifecycle: Manages the lifecycle of operations.
- management: Includes operation management functionalities.
- nodes: Manages operation nodes.
- operations:
  - computation: Includes computational operations.
  - core: Provides core functionalities for operations.
    - inheritance: Provides functionalities for managing child operations in a parent operation.
    - control: Provides control functionalities for operations including start, pause, resume, and stop.
    - execution: Provides execution functionalities for operations including preparation and execution of actions.
    - progress: Provides functionalities for tracking and updating the progress of operations.
    - workspace: Provides functionalities for workspace interactions, loading, and saving operations.
  - system: Includes common system operations, such as ResourceMonitorOperation and ConsoleOperation.
- task: Manages all tasks associated with operations.
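To give a feel for how control (start, pause, resume, stop), progress tracking, and chains could fit together, here is a deliberately small sketch. None of these names come from the RAS code base: `OperationState`, `SketchOperation`, and `run_chain` are hypothetical, and the real packages are likely structured quite differently.

```python
from enum import Enum


class OperationState(Enum):
    IDLE = "idle"
    RUNNING = "running"
    PAUSED = "paused"
    STOPPED = "stopped"
    COMPLETED = "completed"


class SketchOperation:
    """Hypothetical operation with start/pause/resume/stop and progress."""

    def __init__(self, name, steps):
        self.name = name
        self.steps = list(steps)  # callables executed in order
        self.state = OperationState.IDLE
        self.completed_steps = 0

    @property
    def progress(self):
        """Percentage of steps finished; an empty operation counts as 100."""
        if not self.steps:
            return 100
        return int(100 * self.completed_steps / len(self.steps))

    def start(self):
        if self.state is not OperationState.IDLE:
            raise RuntimeError(f"cannot start from state {self.state.value}")
        self.state = OperationState.RUNNING

    def pause(self):
        if self.state is OperationState.RUNNING:
            self.state = OperationState.PAUSED

    def resume(self):
        if self.state is OperationState.PAUSED:
            self.state = OperationState.RUNNING

    def stop(self):
        self.state = OperationState.STOPPED

    def step(self):
        """Run the next pending step; return False unless currently running."""
        if self.state is not OperationState.RUNNING:
            return False
        if self.completed_steps < len(self.steps):
            self.steps[self.completed_steps]()
            self.completed_steps += 1
        if self.completed_steps == len(self.steps):
            self.state = OperationState.COMPLETED
        return True


def run_chain(operations):
    """Run operations one after another, a simple stand-in for a chain."""
    for operation in operations:
        operation.start()
        while operation.state is OperationState.RUNNING:
            operation.step()
    return [operation.state for operation in operations]
```

A paused operation ignores `step()` until `resume()` is called, which is the property a pause control needs. A production version would more likely be asynchronous.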
The gui package provides graphical user interfaces for interacting with RAS.
Note: The GUI package will be optional in future distributions, given RAS has real-time command-line interface integration.
- assets: Contains GUI assets, such as images and icons.
- base: Base GUI components.
- dialogs: Contains dialog components, divided into subcategories:
  - data_handling: Dialogs related to data handling.
  - visualization: Dialogs for data visualization.
  - settings: Settings-related dialogs.
  - management: Management-related dialogs.
- launcher: GUI launcher scripts.
- modules: Different GUI modules.
- utils: Utility scripts for GUI components.
The data_engine package handles the primary functionality for data processing and management within a project.
- core: Core data processing modules.
- data_streams: Handles live data input streams.
- engine: Data engine implementations.
- integration: Integration with external data sources, such as Amazon S3.
- utils: Utility scripts for data handling.
- variable_storage: Manages variable storage.
  - storage: Different storage backends for variables.
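The idea of variable storage with interchangeable backends can be sketched with a small interface and two implementations. This is an assumption-laden illustration, not the RAS design: `StorageBackend`, `MemoryBackend`, `JsonFileBackend`, and `VariableStore` are hypothetical names chosen for the example.

```python
import json
import tempfile
from abc import ABC, abstractmethod
from pathlib import Path


class StorageBackend(ABC):
    """Interface that every hypothetical backend implements."""

    @abstractmethod
    def save(self, name, value):
        """Persist a value under the given variable name."""

    @abstractmethod
    def load(self, name):
        """Return the stored value, raising KeyError when it is missing."""


class MemoryBackend(StorageBackend):
    """Keeps variables in a plain dictionary; contents vanish on exit."""

    def __init__(self):
        self._values = {}

    def save(self, name, value):
        self._values[name] = value

    def load(self, name):
        return self._values[name]


class JsonFileBackend(StorageBackend):
    """Keeps JSON-serializable variables in a single file on disk."""

    def __init__(self, path):
        self._path = Path(path)

    def _read_all(self):
        if not self._path.exists():
            return {}
        return json.loads(self._path.read_text(encoding="utf-8"))

    def save(self, name, value):
        values = self._read_all()
        values[name] = value
        self._path.write_text(json.dumps(values), encoding="utf-8")

    def load(self, name):
        return self._read_all()[name]


class VariableStore:
    """Front end that delegates to whichever backend it was given."""

    def __init__(self, backend):
        self._backend = backend

    def set(self, name, value):
        self._backend.save(name, value)

    def get(self, name):
        return self._backend.load(name)


# The calling code is identical for both backends.
memory_store = VariableStore(MemoryBackend())
memory_store.set("sampling_rate_hz", 30)

with tempfile.TemporaryDirectory() as tmp:
    file_store = VariableStore(JsonFileBackend(Path(tmp) / "variables.json"))
    file_store.set("sampling_rate_hz", 30)
    reloaded = file_store.get("sampling_rate_hz")
```

Because the store only talks to the interface, a new backend can be added without touching the code that reads and writes variables.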
The analytics package handles the application and visualization of data transformations within a project.
- core: Core analytical processing and transformations.
- custom_user: Custom user-defined transformations and configurations.
- evaluation: Metrics and evaluation scripts for model performance.
- models: Machine learning and statistical models.
- prediction: Modules for making predictions using trained models.
- preloaded: Preloaded transformations and configurations.
  - transformations: Preloaded transformation modules.
- preprocessing: Data preprocessing modules.
- training: Modules for training machine learning models.
- utils: Utility functions and common metrics.
- visualization: Display and visualization of analytical transformations.
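One common way to let preloaded and user-defined transformations coexist is a name-based registry. The sketch below shows that pattern under stated assumptions: `register_transform`, `apply_transform`, and both example transformations are hypothetical and do not reflect the actual RAS API.

```python
# Registry mapping a transformation name to the function implementing it.
_TRANSFORMS = {}


def register_transform(name):
    """Decorator that adds a function to the registry under `name`."""
    def decorator(func):
        if name in _TRANSFORMS:
            raise ValueError(f"transformation {name!r} is already registered")
        _TRANSFORMS[name] = func
        return func
    return decorator


def apply_transform(name, values, **options):
    """Look up a transformation by name and apply it to a list of numbers."""
    try:
        transform = _TRANSFORMS[name]
    except KeyError:
        raise KeyError(f"unknown transformation {name!r}") from None
    return transform(values, **options)


# A transformation that could ship "preloaded" with the suite.
@register_transform("moving_average")
def moving_average(values, window=3):
    """Mean of each full window of `window` consecutive values."""
    if window < 1:
        raise ValueError("window must be at least 1")
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


# A user-defined transformation registered the same way.
@register_transform("clip_negative")
def clip_negative(values):
    """Replace negative samples with zero."""
    return [max(value, 0) for value in values]


print(apply_transform("moving_average", [1, 2, 3, 4, 5], window=3))
print(apply_transform("clip_negative", [-1.5, 0.2, 3.0]))
```

Since custom functions go through the same decorator as the built-in ones, a workflow can refer to either kind by name, which is one way shared analytic workflows could stay portable.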
To be implemented at a later date.
Contributions are what make the open source community such an amazing place to learn, inspire, and create.
Open-source contributions will be available in the near future. Star & watch the project to stay tuned for updates!
Distributed under the BSD 3-Clause License. See LICENSE for more information.
Neurobiological Research Technician
Gire Lab, University of Washington
email: justlane@uw.edu
Associate Professor, Principal Investigator
Gire Lab, University of Washington
email: dhgire@uw.edu
Project Link: https://github.com/lane-neuro/research-analytics-suite