Flowbber - The Data Pipeline Framework
Flowbber is a generic tool and framework for executing custom pipelines for data gathering, publishing and analysis.
pip3 install flowbber
- Adapt the pytest source to the latest pytest. Starting with pytest 4.6.7, the report adds a <testsuites> root element. Also, "skips" was renamed to "skipped" in newer pytest; both options are supported.
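The compatibility handling described above can be sketched as follows. This is a minimal illustration using only the standard library, not Flowbber's actual implementation; the function name `read_testsuite_summary` and the sample reports are hypothetical.

```python
import xml.etree.ElementTree as ET


def read_testsuite_summary(xml_text):
    """Return summary counters from a pytest JUnit-like XML report."""
    root = ET.fromstring(xml_text)

    # Newer reports wrap the suite in a <testsuites> root element;
    # older ones use <testsuite> as the root itself.
    if root.tag == 'testsuites':
        suite = root.find('testsuite')
        if suite is None:
            raise ValueError('No <testsuite> element found')
    else:
        suite = root

    # The skipped counter is named "skips" in older reports and
    # "skipped" in newer ones, so try both.
    skipped = suite.get('skipped', suite.get('skips', '0'))

    return {
        'tests': int(suite.get('tests', '0')),
        'failures': int(suite.get('failures', '0')),
        'errors': int(suite.get('errors', '0')),
        'skipped': int(skipped),
    }


# Hypothetical sample reports in the old and new layouts
OLD_REPORT = '<testsuite tests="3" failures="1" errors="0" skips="1"/>'
NEW_REPORT = (
    '<testsuites>'
    '<testsuite tests="3" failures="1" errors="0" skipped="1"/>'
    '</testsuites>'
)

print(read_testsuite_summary(OLD_REPORT))
print(read_testsuite_summary(NEW_REPORT))
```

Both calls return the same dictionary, which is the point of supporting both layouts.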
New Data Splitter Sink.
The new DataSplitterSink allows users to extract branches of the data tree and dynamically create files with that data using other parts of the data tree as ID.
Check the documentation for more information.
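To make the idea concrete, here is a rough sketch of splitting one branch of a data tree into a file named after another part of the tree. It is not the DataSplitterSink API: the helpers `get_branch` and `split_data`, the dotted-path notation and the sample data are all assumptions made for illustration.

```python
import json
import tempfile
from pathlib import Path


def get_branch(data, path):
    """Walk a nested dict following a dotted path such as 'a.b.c'."""
    node = data
    for key in path.split('.'):
        node = node[key]
    return node


def split_data(data, branch, id_path, directory):
    """
    Write one branch of the data tree to a JSON file whose name is taken
    from another part of the tree. Hypothetical helper, not Flowbber's API.
    """
    identifier = get_branch(data, id_path)
    target = Path(directory) / '{}.json'.format(identifier)
    target.write_text(
        json.dumps(get_branch(data, branch), indent=4),
        encoding='utf-8',
    )
    return target


# Hypothetical collected data
collected = {
    'git': {'revision': 'abc123'},
    'coverage': {'total': {'line_rate': 0.93}},
}

with tempfile.TemporaryDirectory() as tmp:
    written = split_data(collected, 'coverage.total', 'git.revision', tmp)
    written_name = written.name
    written_data = json.loads(written.read_text(encoding='utf-8'))

print(written_name)
print(written_data)
```

Here the `coverage.total` branch ends up in a file named after the value found at `git.revision`.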
Adds new field total_errors to Valgrind sources.
Now the Valgrind XML sources will have a new field called total_errors. This field holds the count of errors that occurred, so users no longer need to calculate the length of the errors list manually.
Allow saving the journal in a specific location.
The journal is a JSON file that reports all details about the execution of the pipeline including how many times the pipeline was run, when, which components passed or failed, among others.
The journal was a debug tool, but now it is user consumable. To accommodate this, the format of the journal changed to include more information.
To save the journal issue the flag
examples/journal/journal.json for an example.
Add option to Mongo sink to allow key overwrite.
This allows a pipeline to override an entry in the database using the same key.
    [sinks.config]
    overwrite = true
Add type parsing in env source.
A type can be specified for each environment variable, so that it is parsed and collected with the expected datatype. Types available are:
integer: Using Python's
float: Using Python's
string: Using Python's
auto: Using Flowbber's
boolean: Using Flowbber's
iso8601: Using Flowbber's
    [sources.config]
    include = [
        "TESTENV_INT",
    ]

    [sources.config.types]
    TESTENV_INT = "integer"
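As an illustration of typed collection, the sketch below parses included environment variables according to a per-variable type. It is not Flowbber's code: it assumes that integer, float and string map to Python's built-in `int`, `float` and `str`, it substitutes a hypothetical `parse_boolean` for Flowbber's own boolean parser, it omits auto and iso8601, and it assumes untyped variables default to string.

```python
import os


def parse_boolean(value):
    """Hypothetical boolean parser; Flowbber's own rules may differ."""
    normalized = value.strip().lower()
    if normalized in ('true', 'yes', 'on', '1'):
        return True
    if normalized in ('false', 'no', 'off', '0'):
        return False
    raise ValueError('Not a boolean: {!r}'.format(value))


# Assumed mapping from type name to parser function
PARSERS = {
    'integer': int,
    'float': float,
    'string': str,
    'boolean': parse_boolean,
}


def collect_env(include, types, environ=None):
    """Collect the included variables, parsed with their declared type."""
    environ = os.environ if environ is None else environ
    collected = {}
    for name in include:
        if name not in environ:
            continue
        # Assumption: variables without a declared type stay strings
        parser = PARSERS[types.get(name, 'string')]
        collected[name] = parser(environ[name])
    return collected


# A fake environment keeps the example reproducible
fake_environ = {'TESTENV_INT': '42', 'TESTENV_NAME': 'demo'}
result = collect_env(
    include=['TESTENV_INT', 'TESTENV_NAME'],
    types={'TESTENV_INT': 'integer'},
    environ=fake_environ,
)
print(result)
```

With the configuration shown above, `TESTENV_INT` is collected as the integer 42 instead of the string "42".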
Pipelines can now be defined in YAML format.
    sources:
      - type: timestamp
        id: timestamp
        config:
          epochf: true
          iso8601: true
          strftime: '%Y-%m-%d %H:%M:%S'
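For readers more familiar with the JSON format, the snippet below writes the same definition as the Python data structure a YAML parser would produce and prints it as JSON. The last lines apply the strftime pattern to a fixed, made-up moment only to show what the pattern yields; how the timestamp source uses it internally is not covered here.

```python
import json
from datetime import datetime, timezone

# The YAML definition above, as the equivalent Python data structure
pipeline = {
    'sources': [
        {
            'type': 'timestamp',
            'id': 'timestamp',
            'config': {
                'epochf': True,
                'iso8601': True,
                'strftime': '%Y-%m-%d %H:%M:%S',
            },
        },
    ],
}

print(json.dumps(pipeline, indent=4))

# Hypothetical fixed moment, so that the output is reproducible
moment = datetime(2019, 1, 2, 3, 4, 5, tzinfo=timezone.utc)
pattern = pipeline['sources'][0]['config']['strftime']
formatted = moment.strftime(pattern)
print(formatted)  # 2019-01-02 03:04:05
```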
Use which genhtml to find the executable in the lcov_html sink.
This fixes an issue where the executable could not be found if a custom
- The gtest source now supports XML files generated by gtest 1.8.1+.
- exclude_files options in many Sinks and Sources. See FilterSink Options for more information.
- compress option added to the archive sink allows creating compressed ZIP archives.
- extract option added to the JSON source allows loading JSON files from ZIP archives.
- --derive-func-data options are now available to use on the LCOV source.
- Updated schemas to use Cerberus >=1.3.1 definition.
- --dry-run flag allows parsing, loading, validating and building a pipeline without executing it.
- Improved logging when trying to instantiate a component, to help debug a pipeline that went wrong.
- Improved logging to show messages at a higher level when things go wrong.
- Fix for missing plugin entries in documentation.
- Fix for documentation issue #27.
- New LCOV merger aggregator allows summing multiple LCOV sources.
- Fix a bug that ignored rc_overrides when using a file input in the LCOV source.
- lcov source no longer accepts directory as configuration. The new option source supersedes it, and allows specifying a directory to generate a tracefile from, or loading a tracefile already generated.
- Refactored Valgrind source to support loading data from Helgrind and DRD tools.
- New "Expander" aggregator that allows moving subdata to the top level. This is useful to load data using JSONSource or similar sources and place it in the top level as if it were data from other anonymous sources, or to replay a pipeline using previously collected data.
- Add support for path in InfluxDB sink.
- Fixed flake8 issues shown in new version.
- Source for Valgrind's memcheck will now always output the stack attribute as a list.
- New Config source that allows adding arbitrary data directly from the pipeline definition.
- All plugins now show the example usage in both JSON and TOML.
- Improved documentation for the memcheck source.
- The Internet speed source plugin is unavailable as the upstream package providing the measurement is currently broken: fopina/pyspeedtest#15
- Fix in pytest source that caused a test case with both failure and error to be overridden by the other: pytest-dev/pytest#2228
- Minor fix in memcheck source plugin that caused output that violates the expected schema.
- The InfluxDB sink is now compatible with influxdb client version 5.0.0.
- New timezone option for the timestamp source.
- New source for Valgrind's Memcheck.
- Add lcov source and lcov html sink.
- New JSON source to fetch and parse local (file system) or remote (http, https) JSON files.
- The CoberturaSource now returns the list of ignored files.
- TemplateSink now supports passing filters.
- All sinks can now filter the input data.
- New FilterAggregator allows filtering the data structure before sending it to the sinks.
- When using the TemplateSink, extra data can now be passed from the pipeline definition to the template by using the new 'payload' configuration option. Fixes #5.
- Each entry from the collected data can now be put into its own collection when using the MongoDBSink. Fixes #2.
- Added a source that counts lines of code in a directory.
- Added a new Git source that provides revision, tag and author information of a git repository.
- New GitHub source that allows collecting statistics of closed / open pull requests and issues.
- New Google Test source.
- Added a "pretty" option to the ArchiveSink to make JSON output pretty. Also, JSON file is now saved in UTF-8.
- Added new source plugin for pytest's JUnit-like XML test results.
- CoberturaSource now supports filenames include and exclude patterns.
- UserSource no longer returns the login key and instead returns a user key.
- Templates used in the TemplateSink can now load sibling templates. Previous way to specify python:// templates changed.
- MongoDBSink now uses None as default for the key configuration option. Related to #4.
- InfluxDBSink now uses None as default for the key configuration option. Related to #4.
- Local flowconf can now be reloaded in the same process.
- Fix a deadlock condition when a non-optional component failed while sibling components were still running.
- Fixes #6 : InfluxDBSink doesn't support None values.
- Journal is now saved in UTF-8.
- Fixed high CPU usage by the logging manager subprocess.
- flowbber.logging.print will now convert to string any input provided.
- Fix minor typo in EnvSource include / exclude logic.
- The pipeline executor will now join the process of a component (max 100ms) after fetching its response in order to try to get its exit code.
- Added "optional" and "timeout" features to pipeline components.
- Git helpers now live in their own utilities module.
- Fixed bug where pipeline execution counter didn't increment.
- Initial version.
Copyright (C) 2017-2019 KuraLabs S.R.L Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.