# Quality Flag System

## Motivation and concept

The data-quality flag system was added to the production code in version SNF-02-03. In previous versions, the "QualityS" string stored a single message per process. The new system lets a process carry multiple warnings or errors, lets children inherit flags from their parents, and makes flagged processes easy to find in the DB with simple queries.

A given process is flagged as good or bad if its "Quality" key (int) is set to 1 or 2, respectively. Its "QualityS" key is a string of up to 40 characters of 8 bits each. In principle, 320 bits are thus available, and each bit can store one specific warning or error message, provided it is tied to an official, immutable list of flags.

For a new process with no flag yet, QualityS is an empty string (''). If ‘a’ and ‘c’ are the activated flags from a list of existing flags (‘a’, ‘b’, ‘c’, etc.), two bits of the first byte (character) are set, giving the pattern 101 (10100000). In this example, ‘a’ has index 0 and ‘c’ has index 2. The index of a given flag is fixed, i.e., it will ALWAYS be 0 for ‘a’ in our example. The corresponding string is then saved in the DB as unicode.

To distinguish the new flag system from previously stored information, each new flag string starts with a "$" character. We thus lose one character, i.e. 8 bits. In practice, a null byte (00000000) cannot be stored in a DB string, so every character except the leading "$" has its first bit forced to 1. We thus lose 39 more bits, which leaves 273 usable bits (320 - 8 - 39). Each of them can represent a different warning/error/flag. If a process inherits a quality string from its parents, its QualityS is updated by adding the new flags while keeping those of its parents.
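The layout described above can be sketched in a few lines of Python. This is only an illustration, not the `libRecord` implementation: `capacity`, `encode_flags` and the `TEST_FLAGS` list are hypothetical names, and the choice to drop trailing characters that carry no flag is an assumption inferred from the `dbbytes` values shown later in this document.

```python
import string

# Hypothetical stand-in for the official list; the position in the list is the bit index.
TEST_FLAGS = list(string.ascii_uppercase + string.ascii_lowercase)

MAX_CHARS = 40          # maximum length of the QualityS string
INFO_BITS_PER_CHAR = 7  # the first bit of every data character is forced to 1


def capacity(max_chars=MAX_CHARS):
    """Usable flag bits: one character is spent on '$', and each
    remaining character gives up its first bit."""
    return (max_chars - 1) * INFO_BITS_PER_CHAR


def encode_flags(active, flag_list=TEST_FLAGS):
    """Pack the active flags into a '$'-prefixed string (sketch only)."""
    indexes = sorted(flag_list.index(name) for name in active)
    if not indexes:
        return ''  # assumption: a process without flags keeps an empty QualityS
    n_chars = indexes[-1] // INFO_BITS_PER_CHAR + 1
    values = [0x80] * n_chars  # first bit set, so no character is a null byte
    for idx in indexes:
        char_pos, bit_pos = divmod(idx, INFO_BITS_PER_CHAR)
        values[char_pos] |= 0x40 >> bit_pos  # info bits are filled left to right
    return '$' + ''.join(chr(v) for v in values)


print(capacity())  # 273
print(repr(encode_flags(['A', 'M', 'b', 'v'])))
```

With the test flags 'A', 'M', 'b' and 'v', this sketch yields the same string as the `f.dbbytes` example in the "Check the bytes and bits values" section.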

The next sections show how warnings are stored in these bytes, and how to access, read, modify, save and query them in the DB.

## Official list of flags 
The current official list of flags is stored in the libRecord library. To get them all, use the following Python command:

In [6]:
import libRecord as LR
LR.FLAGS_INFO

[('ES_PRIOR_POSITION', 'ES position away from prior'),
 ('ES_PRIOR_SEEING', 'ES seeing away from prior'),
 ('ES_PRIOR_AIRMASS', 'ES airmass away from prior'),
 ('ES_PRIOR_PARANGLE', 'ES parangle away from prior'),
 ('ES_MIS-CENTERED', 'ES position mis-centered'),
 ('PFC_XNIGHT', 'PFC cross-night flux calibration'),
 ('PFC_RELFLX', 'PFC relative flux calibration'),
 ('ARTIFICIAL_ARC', 'Is or has a parent usin an artifical arc (004_901)')]

The flags listed above will always be present: none of them can be deleted, and new flags can only be added to this list. For now, this step is manual and has to be done with caution, making sure that no flag disappears and that their order stays the same.
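Since the update is manual, a small check can help confirm that an edited list still extends the previous one without touching existing indexes. The helper below is a hypothetical sketch (`is_append_only_update` is not part of `libRecord`, and `NEW_FLAG` is an invented name used only for illustration).

```python
def is_append_only_update(old_flags, new_flags):
    """Return True if every old flag is still present at the same index."""
    return list(new_flags[:len(old_flags)]) == list(old_flags)


old = ['ES_PRIOR_POSITION', 'ES_PRIOR_SEEING', 'ES_PRIOR_AIRMASS']

appended = old + ['NEW_FLAG']          # hypothetical flag added at the end: fine
inserted = ['NEW_FLAG'] + old          # shifts every existing index: not allowed
removed = old[:1] + old[2:]            # a flag disappeared: not allowed

print(is_append_only_update(old, appended))
print(is_append_only_update(old, inserted))
print(is_append_only_update(old, removed))
```

Only the first update passes, because it is the only one that leaves the bit index of every existing flag unchanged.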

## Query the DB for a flag
A function of `libRecord` called `query_flag_in_db` is used to query the DB for processes having one of the official flags. This function returns a list of processes filtered by their QualityS. 

In [9]:
for f in LR.FLAGS:
    print "%i processes with flag %s found" % (LR.query_flag_in_db(f).count(), f)

1 processes with flag ES_PRIOR_POSITION found
7887 processes with flag ES_PRIOR_SEEING found
267 processes with flag ES_PRIOR_AIRMASS found
1529 processes with flag ES_PRIOR_PARANGLE found
1791 processes with flag ES_MIS-CENTERED found
1615 processes with flag PFC_XNIGHT found
57522 processes with flag PFC_RELFLX found
11207 processes with flag ARTIFICIAL_ARC found


This function usually returns a query set on the Process table, which can then be used like any other query set. Let's take all the processes containing the flag "ARTIFICIAL_ARC" as an example.

In [13]:
flag = 'ARTIFICIAL_ARC'
procs = LR.query_flag_in_db(flag)
print procs.count(), "processes found with flag", flag
# now let's apply some filtering
print procs.filter(Fclass=666, XFclass=800).count(), "host-subtracted (cubefit) and flux-calibrated spectra found with this flag"

11207 processes found with flag ARTIFICIAL_ARC
355 host-subtracted (cubefit) and flux-calibrated spectra found with this flag


You can also obtain the query used to get them, instead of the processes:

In [19]:
from SnfQuery import Q
from processing.process.models import Process
q = LR.query_flag_in_db(flag, queryonly=True)
q &= Q(Fclass=666, XFclass=800)
procs = Process.objects.filter(q)
print procs.count(), "processes found (as found above)"

355 processes found (as found above)


## How does it work?

### Create a flag instance and add new flags

In [3]:
# A list of flags used for testing purposes
print LR.TESTFLAGS

['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z', 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z']


In [4]:
# Create a new flag instance using the test mode, i.e, the test flags
f = LR.Flag(test=True)

In [5]:
# First test: flags to add must exist in the list of test flags (TESTFLAGS); 'test' does not
f.set_flags('test')

Flag 'test' not in the official list of flags


In [6]:
# Check the list of existing flags of this instance
f.list_flags()

[]


In [7]:
# Add flags that actually exist
f.set_flags(['A', 'M', 'R', 'b', 'v', 'z'])

Setting flag 'A'
Setting flag 'M'
Setting flag 'R'
Setting flag 'b'
Setting flag 'v'
Setting flag 'z'


In [8]:
# Check the list of flags again
f.list_flags()

['A', 'M', 'R', 'b', 'v', 'z']


In [9]:
# Unset an existing flag (also works with a list of flags, e.g. ['A', 'B'])
f.unset_flags('R')

Unseting flag 'R'


In [10]:
# Try to unset a non-existing flag
f.unset_flags('28')

Flag '28' not in the list of active flags


In [11]:
f.list_flags()

['A', 'M', 'b', 'v', 'z']


### Check the bytes and bits values

In [12]:
# Check the bytes value which will be stored in the DB
f.dbbytes

u'$\xc0\x82\x80\x81\x80\x80\x82\x90'

In [13]:
# Check the bit values: every byte has its first (leftmost) bit set to 1, so only 7 bits per byte store information
print f.bits

110000001000001010000000100000011000000010000000100000101001


In [14]:
# Number of full bytes currently used (up to 40 available, minus 1 since dbbytes always starts with "$");
# the integer division does not count a trailing partial byte
print "%i bytes used" % (len(f.bits) / 8)

7 bytes used


In [15]:
# Remove the 'z' flag
f.unset_flags('z')

Unseting flag 'z'


In [16]:
f.bits

'1100000010000010100000001000000110000000100000001000001'

In [17]:
f.dbbytes

u'$\xc0\x82\x80\x81\x80\x80\x82'

In [18]:
print "%i bytes used" % (len(f.bits) / 8)

6 bytes used


In [19]:
# check if a given flag is set
f.is_set('M')

True

In [20]:
f.is_set('T')

False

In [21]:
f.active_flags

['A', 'M', 'b', 'v']

In [22]:
# print the current status of a flag instance
print f

Initial flag info:
  Init Int value: 0
Current flag info
  Int: 18296979457130755
  Bits value: 1100000010000010100000001000000110000000100000001000001
  Active flags:
   - 'A'
   - 'M'
   - 'b'
   - 'v'



### Query a flag

In [23]:
# let's say that a process has the QualityS value created above
QualityS = f.dbbytes
QualityS

u'$\xc0\x82\x80\x81\x80\x80\x82'

In [24]:
# Now let's make a simple query (regex) to check for the presence of a given flag
LR.query_flag(QualityS, 'A', test=True)

True

In [25]:
LR.query_flag(QualityS, 'R', test=True)

False
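How `query_flag` builds its regex is not shown here. One plausible construction, given as a sketch under the assumptions of the storage layout described above (a leading "$", then 7 information bits per character), is to skip the characters that precede the one holding the flag and then require a character whose relevant bit is set. The names `flag_regex` and `has_flag` are hypothetical.

```python
import re

INFO_BITS_PER_CHAR = 7  # the first bit of each data character is always 1


def flag_regex(index):
    """Build a pattern matching any flag string in which bit `index` is set."""
    char_pos, bit_pos = divmod(index, INFO_BITS_PER_CHAR)
    mask = 0x40 >> bit_pos
    # All data characters (0x80-0xFF) that have the wanted bit set
    matching = ''.join(chr(v) for v in range(0x80, 0x100) if v & mask)
    return r'^\$.{%d}[%s]' % (char_pos, matching)


def has_flag(quality_s, index):
    """Check a single QualityS value against the pattern for `index`."""
    return re.search(flag_regex(index), quality_s, re.DOTALL) is not None


quality_s = '$\xc0\x82\x80\x81\x80\x80\x82'   # value created above: 'A', 'M', 'b', 'v'
print(has_flag(quality_s, 0))    # 'A' has index 0 in the test list
print(has_flag(quality_s, 17))   # 'R' has index 17 in the test list
```

A string that is too short to contain the requested character simply does not match, which is consistent with trailing empty characters not being stored.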

In [26]:
LR.query_flag_in_db('A', test=True)

[526906600126808060013003]

In [27]:
LR.query_flag_in_db('B', test=True)

[]

In [28]:
p = LR.query_flag_in_db('A', test=True)[0]
p.QualityS

u'$\xcd\xa3\x88\x84\x80\x88\x80\xc0'

A Flag instance can also be created from a QualityS value obtained from the DB, and modified as shown previously before being saved back into the DB. This mode is used to transfer information from a parent process to a child, whereas the previous example applies when a new process has no flagged parent.


In [29]:
fp = LR.Flag(p.QualityS, test=True)
fp.list_flags()

['A', 'D', 'E', 'G', 'I', 'M', 'N', 'R', 'Z', 'm', 'x']
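The decoding step behind this can be sketched as follows. It is an illustration under the assumptions of the storage layout described at the top of this document, not the actual `Flag` code: `decode_flags`, `inherit` and `TEST_FLAGS` are hypothetical names, and treating a string without a leading "$" as carrying no flags is an assumption.

```python
import string

# Hypothetical stand-in for the test flag list; list position = bit index.
TEST_FLAGS = list(string.ascii_uppercase + string.ascii_lowercase)
INFO_BITS_PER_CHAR = 7


def decode_flags(quality_s, flag_list=TEST_FLAGS):
    """Return the names of the flags stored in a '$'-prefixed string."""
    if not quality_s.startswith('$'):
        return []  # assumption: empty or old-style free text carries no flags
    active = []
    for char_pos, char in enumerate(quality_s[1:]):
        for bit_pos in range(INFO_BITS_PER_CHAR):
            idx = char_pos * INFO_BITS_PER_CHAR + bit_pos
            if ord(char) & (0x40 >> bit_pos) and idx < len(flag_list):
                active.append(flag_list[idx])
    return active


def inherit(parent_flags, new_flags, flag_list=TEST_FLAGS):
    """Child flags: the parent's flags plus the new ones, in official order."""
    merged = set(parent_flags) | set(new_flags)
    return [name for name in flag_list if name in merged]


parent = decode_flags('$\xcd\xa3\x88\x84\x80\x88\x80\xc0')
child = inherit(parent, ['B'])
print(parent)
print(child)
```

Applied to the QualityS value of the process `p`, the sketch recovers the same list of active flags as `fp.list_flags()`, and a child raising 'B' keeps every flag of its parent.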


In [30]:
print fp

Initial flag info:
  Init Int value: 216472953637946803
Current flag info
  Int: 216472953637946803
  Bits value: 1100110110100011100010001000010010000000100010001000000011000000
  Active flags:
   - 'A'
   - 'D'
   - 'E'
   - 'G'
   - 'I'
   - 'M'
   - 'N'
   - 'R'
   - 'Z'
   - 'm'
   - 'x'



In [31]:
fp.set_flags('B')
fp.active_flags

Setting flag 'B'


['A', 'B', 'D', 'E', 'G', 'I', 'M', 'N', 'R', 'Z', 'm', 'x']

In [32]:
fp.unset_flags('D')
fp.active_flags

Unseting flag 'D'


['A', 'B', 'E', 'G', 'I', 'M', 'N', 'R', 'Z', 'm', 'x']

What we actually save into the DB is the dbbytes value, which starts with a '$' and has the first bit of every following byte set to 1.

In [33]:
fp.dbbytes

u'$\xe5\xa3\x88\x84\x80\x88\x80\xc0'

In [34]:
[fp.bits[i*8:(i+1)*8] for i in range(len(fp.bits)/8+bool(len(fp.bits)%8))]

['11100101',
 '10100011',
 '10001000',
 '10000100',
 '10000000',
 '10001000',
 '10000000',
 '11']