
Fixture management

Matthew Perry edited this page Apr 8, 2014 · 37 revisions

This app relies on a lot of pre-loaded data, much of it required for basic operation.

Models with fixtures installed by the import_data management command:

  • RasterDataset
  • IdbSummary
  • County
  • FVSSpecies
  • TreeliveSummary
  • FVSAggregate [1]
  • ConditionVariantLookup [1]
  [1] Procedures for updating these from new GYBatch runs are described under
      "Models with fixtures that you MAY want to install manually" below.

  All of these data are downloaded from http://labs.ecotrust.org/forestplanner/data/*
  if they are not already on your local drive. The labs URL should be considered
  the authoritative source for this data.
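The download-if-missing behavior can be sketched as a small Python helper (the function name and fetch callable here are illustrative, not the app's actual API):

```python
import os

def ensure_local(path, fetch):
    """Download a data file only if it is not already on the local drive.

    `fetch` is any callable returning the file's bytes; in the app it
    would pull from http://labs.ecotrust.org/forestplanner/data/.
    """
    if not os.path.exists(path):
        with open(path, "wb") as fh:
            fh.write(fetch())
    return path
```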

Models with fixtures installed with initial_data.json (happens automatically on migrate):

  • FVSVariant
  • Rx
  • TimberPrice

If the data get updated, re-create the initial_data fixture and commit it:

    python manage.py dumpdata --indent=2 trees.fvsvariant trees.rx trees.timberprice > trees/fixtures/initial_data.json

Models with fixtures that you MAY want to install manually:

FVSAggregate and ConditionVariantLookup
  • prep the data per the growth-yield-batch docs
  • build_keys
  • batch_fvs
  • fix errors; if a cond's errors are not fixable, remove the entire cond, e.g.:
    DELETE 
    FROM trees_fvsaggregate
    WHERE var = 'EC' and cond in (34101,34179,11385,34175,34194,11385);
  • QC row count
    sqlite> select sum(cnt) 
            from (select var, 
                         count(distinct(cond)),
                         count(distinct(rx)),
                         count(distinct(cond)) * count(distinct(rx)) as cnt 
                  from trees_fvsaggregate group by var) as foo;
    8179
    sqlite> select 8179 * 21;  -- 21 time periods
    171759
    sqlite> select count(*) from trees_fvsaggregate;
    171759
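The same QC arithmetic can be exercised in miniature with Python's sqlite3 module (toy data with 3 periods instead of 21; table and column names follow the real schema):

```python
import sqlite3

# For each variant, every (cond, rx) pair should appear once per time period:
#   sum over variants of distinct(cond) * distinct(rx) * periods == count(*)
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trees_fvsaggregate (var TEXT, cond INT, rx INT, year INT)")
periods = 3  # the real batch uses 21
conn.executemany(
    "INSERT INTO trees_fvsaggregate VALUES (?, ?, ?, ?)",
    [(v, c, r, y) for v in ("PN", "EC") for c in (1, 2)
                  for r in (10, 11) for y in range(periods)])

(expected,) = conn.execute("""
    SELECT sum(cnt) FROM (
        SELECT count(DISTINCT cond) * count(DISTINCT rx) AS cnt
        FROM trees_fvsaggregate GROUP BY var) AS foo""").fetchone()
(total,) = conn.execute("SELECT count(*) FROM trees_fvsaggregate").fetchone()
assert expected * periods == total  # 8 * 3 == 24
```

If the assertion fails on the real data, some (cond, rx, year) combinations are missing or duplicated and the batch needs investigation before loading.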
  • determine the FP schema (in dbshell on the FP dev machine)
SELECT column_name
FROM information_schema.columns
WHERE table_name = 'trees_fvsaggregate'
ORDER BY ordinal_position;

which, after joining with commas and quotes, looks like:

"id", "agl", "bgl", "calc_carbon", "cond", "dead", "offset", "rx", "site", "total_stand_carbon", "var", "year", "merch_carbon_removed", "merch_carbon_stored", "cedr_bf", "cedr_hrv", "ch_cf", "ch_hw", "ch_tpa", "cut_type", "df_bf", "df_hrv", "es_btl", "firehzd", "hw_bf", "hw_hrv", "lg_cf", "lg_hw", "lg_tpa", "lp_btl", "mnconbf", "mnconhrv", "mnhw_bf", "mnhw_hrv", "nsodis", "nsofrg", "nsonest", "pine_bf", "pine_hrv", "pp_btl", "sm_cf", "sm_hw", "sm_tpa", "spprich", "sppsimp", "sprc_bf", "sprc_hrv", "wj_bf", "wj_hrv", "ww_bf", "ww_hrv", "after_ba", "after_merch_bdft", "after_merch_ft3", "after_total_ft3", "after_tpa", "age", "removed_merch_bdft", "removed_merch_ft3", "removed_total_ft3", "removed_tpa", "start_ba", "start_merch_bdft", "start_merch_ft3", "start_total_ft3", "start_tpa"
  • query data.db to match this schema: remove "id" and note that "site" must be cast to an integer (see the SELECT below)
  • output to tsv

.mode tabs
.headers off
.output ../fvsaggregate_batch3.tsv

    SELECT "agl", "bgl", "calc_carbon", "cond", "dead", "offset", "rx", cast("site" as integer) as "site", "total_stand_carbon", "var", "year", "merch_carbon_removed", "merch_carbon_stored", "cedr_bf", "cedr_hrv", "ch_cf", "ch_hw", "ch_tpa", "cut_type", "df_bf", "df_hrv", "es_btl", "firehzd", "hw_bf", "hw_hrv", "lg_cf", "lg_hw", "lg_tpa", "lp_btl", "mnconbf", "mnconhrv", "mnhw_bf", "mnhw_hrv", "nsodis", "nsofrg", "nsonest", "pine_bf", "pine_hrv", "pp_btl", "sm_cf", "sm_hw", "sm_tpa", "spprich", "sppsimp", "sprc_bf", "sprc_hrv", "wj_bf", "wj_hrv", "ww_bf", "ww_hrv", "after_ba", "after_merch_bdft", "after_merch_ft3", "after_total_ft3", "after_tpa", "age", "removed_merch_bdft", "removed_merch_ft3", "removed_total_ft3", "removed_tpa", "start_ba", "start_merch_bdft", "start_merch_ft3", "start_total_ft3", "start_tpa" FROM trees_fvsaggregate;
  • transfer tsv to dev server
  • load new data, dbshell on dev server
    DELETE FROM trees_fvsaggregate;  -- takes several minutes

    COPY trees_fvsaggregate ("agl", "bgl", "calc_carbon", "cond", "dead", "offset", "rx", "site", "total_stand_carbon", "var", "year", "merch_carbon_removed", "merch_carbon_stored", "cedr_bf", "cedr_hrv", "ch_cf", "ch_hw", "ch_tpa", "cut_type", "df_bf", "df_hrv", "es_btl", "firehzd", "hw_bf", "hw_hrv", "lg_cf", "lg_hw", "lg_tpa", "lp_btl", "mnconbf", "mnconhrv", "mnhw_bf", "mnhw_hrv", "nsodis", "nsofrg", "nsonest", "pine_bf", "pine_hrv", "pp_btl", "sm_cf", "sm_hw", "sm_tpa", "spprich", "sppsimp", "sprc_bf", "sprc_hrv", "wj_bf", "wj_hrv", "ww_bf", "ww_hrv", "after_ba", "after_merch_bdft", "after_merch_ft3", "after_total_ft3", "after_tpa", "age", "removed_merch_bdft", "removed_merch_ft3", "removed_total_ft3", "removed_tpa", "start_ba", "start_merch_bdft", "start_merch_ft3", "start_total_ft3", "start_tpa") 
    FROM '/usr/local/apps/land_owner_tools/lot/fixtures/try_batch3/fvsaggregate_batch3.tsv'
    WITH NULL AS '';

   -- Note the row count reported by COPY and compare it to the counts above
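A quick way to double-check the TSV itself is to count its lines on both machines; a minimal sketch:

```python
def tsv_rows(path):
    # Headers are off in the sqlite export, so every line is one data row.
    with open(path) as fh:
        return sum(1 for _ in fh)
```

Run it against fvsaggregate_batch3.tsv before and after the transfer, and compare the result to the QC counts above.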
  • refresh indices
    REINDEX TABLE trees_fvsaggregate;
    CLUSTER trees_fvsaggregate USING idx_trees_fvsaggregate_cond;
    VACUUM ANALYZE;
  • refresh condvarlookup
    DELETE FROM trees_conditionvariantlookup;

    INSERT INTO trees_conditionvariantlookup(cond_id, variant_code)
      SELECT cond as cond_id, var as variant_code
      FROM trees_fvsaggregate
      GROUP BY cond, var
  • refresh indices
    REINDEX TABLE trees_conditionvariantlookup;
    VACUUM ANALYZE;
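The lookup refresh above is just a distinct-pairs query; a miniature sqlite3 version (toy rows, real table names):

```python
import sqlite3

# Miniature version of the INSERT ... SELECT ... GROUP BY refresh above:
# one lookup row per distinct (cond, var) pair in the aggregate table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE trees_fvsaggregate (cond INT, var TEXT, year INT);
    CREATE TABLE trees_conditionvariantlookup (cond_id INT, variant_code TEXT);
    INSERT INTO trees_fvsaggregate VALUES
        (34101, 'EC', 0), (34101, 'EC', 5), (11385, 'PN', 0);
    INSERT INTO trees_conditionvariantlookup (cond_id, variant_code)
        SELECT cond, var FROM trees_fvsaggregate GROUP BY cond, var;
""")
(n,) = conn.execute("SELECT count(*) FROM trees_conditionvariantlookup").fetchone()
print(n)  # 2 distinct (cond, var) pairs
```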
  • check_integrity
    python manage.py check_integrity
  • Data migration
    python manage.py clear_cache
    python manage.py shell
  • In the Django python shell...
    from trees.models import Scenario, Stand, ScenarioNotRunnable
    from trees.utils import fake_scenariostands

    # how many don't currently have cond_ids
    missing_before = [x.id for x in Stand.objects.filter(cond_id=None)]
    print len(missing_before)

    # Rerun impute_nearest_neighbor task on all existing stands
    for stand in Stand.objects.all():   #  or filter(...):
        stand.cond_id = None
        stand.save()

    # Wait for all stand.cond_id to be reimputed.
    Stand.objects.filter(cond_id=None).exclude(id__in=missing_before).count()  # should be zero; if not, imputation is still running or has failed

    # investigate failures and fix
    for stand in Stand.objects.filter(cond_id=None).exclude(id__in=missing_before):
        print "--------------------------------"
        print stand, stand.user.username
        try:
            stand.get_cond_id()
        except Exception as e:
            print e

    # Rerun scenarios IF necessary
    for scenario in Scenario.objects.all():
        print "-" * 80
        print scenario, scenario.user, scenario.name
        try:
            fake_scenariostands(scenario)
            scenario.run()
        except ScenarioNotRunnable:
            print "\tNOT RUNNABLE"
  • TODO Create fixtures and upload to server

For dev machines where you want a smaller chunk of test data, use the import_data management command, then follow the appropriate migration procedures above.