Updated changes and README

Stiivi · Oct 21, 2013 · 622b09e
1 parent f3e6df6
Showing 2 changed files with 41 additions and 20 deletions.
diff --git a/CHANGES.md b/CHANGES.md
@@ -7,24 +7,28 @@ Changes in Bubbles
 Overview
 --------
 
+* New data processing graph and new graph based `Pipeline` with customizable
+  execution policy and with pre-execution tests
 * New MongoDB backend with a store, data object and few demo ops
 * New XLS backend with a store and data object
-* New feature: data processing graph and new graph based `Pipeline`
 * New operations (see below)
 
-Operation Changes
------------------
+Operations
+----------
 
-New:
+New operations:
 
 * `filter_by_range`, `filter_not_empty`: rows, sql
 * `split_date`: rows, sql
-* `string_to_date`: rows – still experimental, format will change to SQL date
-  format
 * `field_filter`: mongo (without `rename`)
 * `distinct`: mongo
+* `insert`: (rows, sql) and (sql, sql)
+* `assert_contains`, `assert_missing`: sql
+* `empty_to_missing`: rows – experimental
+* `string_to_date`: rows – still experimental, format will change to SQL date
+  format
 
-Changes and fixes:
+Changed and fixed operations:
 
 * `aggregate` accepts empty measure list – yields only count
 
@@ -38,17 +42,31 @@ New Features
 * new `FieldError` exception
 * Take into account object's data consumability on object use (naive
   implementation for the time being)
-* CSVStore (`csv`) is now able to create CSV targets
+* CSVStore (`csv`) is now able to create CSV targets with `csv_target` factory
+  name
+* New `Resource` class representing file-like resources with optional call to
+  `close()`
+* Added `FileSystemStore` for read-only CSV and XLS files with default
+  settings.
+* Added `Store.exists()`, implemented in SQL backend.
+* `ProbeAssertionError` has a `reason` attribute
+
+Pipeline and execution:
+
 * `Graph` and `Node` structure for building operation processing graphs
 * operation list has an operation prototype that includes operation operand
   and parameter names
 * Added `ExecutionEngine`, currently semi-private, but will serve as basis for
   future custom graph execution policies
-* Added thread_local - thread local variable storage
-* New `Resource` class representing file-like resources with optional call to
-  `close()`
 * Added `Pipeline.execution_plan`
+* Added thread_local - thread local variable storage
 * Added `retry_deny` and `retry_allow` to the operation context
+* Added insert operation accessible through `Pipeline.insert_into` and
+  `Pipeline.insert_into_object`
+* Added `test_if_needed()` and `test_if_satisfied()` methods which are fork()
+  -like but executed before running the pipeline (see documentation for more
+  information)
+
 
 Changes
 -------
@@ -65,6 +83,11 @@ Changes
 * operation context's `o` accessor was renamed to `op` and now also supports
   getitem: `context.op["duplicates"]` is equal to `context.op.duplicates`.
 * data objects should respond to `retained()` and `is_consumable()`
+* default field storage type is now `string` instead of `unknown` for
+  convenience.
+* Removed default setting for debug logging, uses warning level
+* Renamed namespace object name customization class variable `_ns_object_name`
+  to `__identifier__`
 
 Fixes
 -----

diff --git a/README.markdown b/README.markdown
@@ -3,10 +3,9 @@ Bubbles
 
 Also known as Brewery2.
 
-Library and set of tools for processing, auditing and inspecting data using
-virtual data objects.
-
-Focus is on understandability and transparency of the process.
+Bubbles is a Python ETL Framework and set of tools. It can be used for
+processing, auditing and inspecting data. Focus is on understandability and
+transparency of the process.
 
 Project page: http://bubbles.databrewery.org
 
@@ -17,13 +16,13 @@ About
 
 Bubbles is a Python framework for:
 
-* virtual data objects – abstraction of table-like structured datasets.
-  Datasets are treated the same, no matter whether the source is a text file
-  or a database table.
+* ETL (extraction, transformation and loading)
 * preparation of data for further analysis
 * data probing – analysing properties of data, mostly categorical in nature
-* ETL (extraction, transformation and loading)
 * data quality monitoring
+* virtual data objects – abstraction of table-like structured datasets.
+  Datasets are treated the same, no matter whether the source is a text file
+  or a database table.
 
 Installation
 ------------
@@ -65,7 +64,6 @@ Google group or write to the author.
 * Report issues here: https://github.com/Stiivi/bubbles/issues
 * Google group: http://groups.google.com/group/databrewery
 
-
 Author
 ------