Continued working on the tutorial.

OpenDataAlex · Jul 29, 2014 · c001de7
1 parent 0539011

Showing 3 changed files with 65 additions and 2 deletions.
diff --git a/.travis.yml b/.travis.yml
@@ -9,7 +9,6 @@ script:
     - "coverage run --source=etltest setup.py test"
     - 'tox'
 before_script:
-  - mysql -e 'create database etlUnitTest;'
   - mysql -e 'source scripts/etlUnitTest_build.sql'
   - wget http://sourceforge.net/projects/pentaho/files/Data%20Integration/5.0.1-stable/pdi-ce-5.0.1.A-stable.zip/download
   - unzip download

diff --git a/docs/source/tutorial/creating_sample_data_set.rst b/docs/source/tutorial/creating_sample_data_set.rst
@@ -1,2 +1,63 @@
 Creating A Sample Data Set
-==========================
+==========================
+
+Now that we have written our three tests, it's time to create a data set so that we can accurately test them.
+Remember, we have three tests that will require data:
+
+*  Does the first name get lower-cased?
+*  Is an upper-case first name no longer upper case in the target table?
+*  Does the birthday field get impacted by the data integration code?
+
+First, let's create a new folder in our data directory (default is ``${ETL_TEST_ROOT}/Documents/etlTest/data``)::
+
+    cd ${ETL_TEST_ROOT}/Documents/etlTest/data
+    mkdir etlUnitTest
+
+We created the ``etlUnitTest`` directory because ``etlUnitTest`` is the source that our new data set belongs to.
+Since the ``users`` table is the source table for our data integration, we should create a new YAML file called ``users.yml``::
+
+    touch etlUnitTest/users.yml
+    vi etlUnitTest/users.yml
+
+.. include:: yaml_details_stub.rst
+
+Now let's actually build our data set.  Remember, we need a data set that will meet the requirements for our tests.
+For our first record, let's include a standard, run-of-the-mill ``users`` table record::
+
+    1:
+    # Generic record from the users table.
+      user_id: 1
+      first_name:  Bob
+      last_name:  Richards
+      birthday:  2000-01-04
+      zipcode:  55555
+
+Notice that the record is uniquely identified by ``1`` and that all the fields for record one are indented two spaces
+to indicate that they belong together.  To give a field a value, we put a colon followed by a space and then the
+value we need for it, i.e. ``column_name: column_value``.
+
+The record we just created will work fine for our first test case, but what do we do for the next one?  We could copy
+the record and change the ``first_name`` field to ``BOB``, but that runs the risk of test collisions as our test
+suites and data sets get larger.  Let's build a new record specific to this test::
+
+    1:
+    # Generic record from the users table.
+      user_id: 1
+      first_name:  Bob
+      last_name:  Richards
+      birthday:  2000-01-04
+      zipcode:  55555
+    2:
+    # Record for first_name all upper case.
+      user_id: 2
+      first_name:  SARAH
+      last_name: Jenkins
+      birthday:  2000-02-02
+      zipcode:  12345
+
+We start a new record in the YAML file by dropping back to no indentation on the line after record one's ``zipcode``
+field and giving the new record another unique identifier (this time ``2``).  We use the same column names as before,
+but we now have a record with an entirely upper-cased ``first_name`` field.
+
+For the third test case, we could create a new record or reuse one of the existing records to test whether the
+birthday field is manipulated.  For the birthday test, we will use record one.  Now we can work on building our tests.
diff --git a/docs/source/tutorial/yaml_details_stub.rst b/docs/source/tutorial/yaml_details_stub.rst
@@ -0,0 +1,3 @@
+YAML (which stands for YAML Ain't Markup Language) was designed to provide some of the same capabilities as XML
+without the verbosity.  To find out more about YAML, head over to
+`The Official YAML Website <http://www.yaml.org/>`_.
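
Note on the sample data set in ``creating_sample_data_set.rst``: the sketch below shows how the two records line up with the three tests the tutorial lists. It is only an illustration under stated assumptions, not etlTest's own loader or API. The names ``USERS``, ``TEST_RECORDS`` and ``expected_target_row`` are hypothetical, the dictionary is a hand-written mirror of ``users.yml`` rather than something read from disk, and the lower-casing rule is inferred from the test descriptions rather than taken from the actual data integration code.

```python
import datetime

# Hand-written mirror of etlUnitTest/users.yml (assumption: a YAML loader
# would yield a similar mapping of record id -> field dictionary).
USERS = {
    1: {  # Generic record from the users table.
        "user_id": 1,
        "first_name": "Bob",
        "last_name": "Richards",
        "birthday": datetime.date(2000, 1, 4),
        "zipcode": 55555,
    },
    2: {  # Record for first_name all upper case.
        "user_id": 2,
        "first_name": "SARAH",
        "last_name": "Jenkins",
        "birthday": datetime.date(2000, 2, 2),
        "zipcode": 12345,
    },
}

# Hypothetical mapping of each tutorial test to the record that backs it.
TEST_RECORDS = {
    "first_name_lower_cased": 1,
    "upper_case_first_name_not_upper": 2,
    "birthday_unchanged": 1,
}


def expected_target_row(source_row):
    """Return the row we would expect in the target table.

    Assumption: the integration only lower-cases first_name and leaves
    every other field, including birthday, untouched.
    """
    row = dict(source_row)  # copy so the source record is not modified
    row["first_name"] = row["first_name"].lower()
    return row


if __name__ == "__main__":
    for test_name, record_id in TEST_RECORDS.items():
        print(test_name, expected_target_row(USERS[record_id]))
```

Keeping one dedicated record per edge case (record ``2`` for the upper-case name) while reusing record ``1`` for the birthday check matches the trade-off the tutorial describes: fewer records to maintain, without two tests depending on edits to the same field.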