Problem statement
Once in a while, we run into a situation where a new parameter needs to be introduced. Before we jump into the water, let's first understand what kind of parameters are welcome. After all, the parser has to put the data into the database, and that database has a schema that accepts only 2 types of data:
- Measurement with a timestamp, such as:
Water temperature at surface: 16 Celsius occurred at 1503980000
- Measurement with a timestamp and depth, such as:
Water temperature under water: 6 Celsius at 4 meters occurred at 1503980000
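To make the two shapes concrete, here is a minimal sketch of them as plain Java value classes. The class and field names (`TimedReading`, `DepthReading`, `epochSeconds`, `depthMeters`) are made up for illustration and are not part of the parser or the database schema. I'm also assuming that the timestamp in the examples above is seconds since the Unix epoch.

```java
/** Hypothetical: a value observed at a point in time, e.g. surface water temperature. */
final class TimedReading {
    final long epochSeconds; // assumed to be seconds since the Unix epoch
    final double value;      // e.g. 16 (Celsius)

    TimedReading(long epochSeconds, double value) {
        this.epochSeconds = epochSeconds;
        this.value = value;
    }
}

/** Hypothetical: a value observed at a point in time and at a depth below the surface. */
final class DepthReading {
    final long epochSeconds;  // assumed to be seconds since the Unix epoch
    final double depthMeters; // e.g. 4 (meters)
    final double value;       // e.g. 6 (Celsius)

    DepthReading(long epochSeconds, double depthMeters, double value) {
        this.epochSeconds = epochSeconds;
        this.depthMeters = depthMeters;
        this.value = value;
    }
}
```

With these, the two examples above would be `new TimedReading(1503980000L, 16.0)` and `new DepthReading(1503980000L, 4.0, 6.0)`.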
Here is an example of a GLOS OBS XML file, which happens to contain both of the data types mentioned above:
<?xml version="1.0" encoding="ISO-8859-1"?>
<message>
<station>45166</station>
<date>09/09/2017 10:15:00</date>
<met>
<vbat>11.76</vbat>
<wspd1>4.244</wspd1>
<gust1>5.86</gust1>
<wdir1>317.7</wdir1>
<atmp1>13.32</atmp1>
<rrh>77.28</rrh>
<dewpt1>9.439999</dewpt1>
<wtmp1>19.31</wtmp1>
<sbar1>1018.543</sbar1>
<wvhgt>0.1663333</wvhgt>
<nfdompd>1.418</nfdompd>
<dompd>1.481</dompd>
<mwdir>335.1</mwdir>
<a1>0.119</a1>
<b1>-0.055</b1>
<a2>0.264</a2>
<b2>-0.007</b2>
<Age>333</Age>
<fm64ii>820</fm64ii>
<fm64xx>99</fm64xx>
<fm64k1>7</fm64k1>
<fm64k2>0</fm64k2>
<fm64k3>3</fm64k3>
<fm64k4>9</fm64k4>
<fm64k5>5</fm64k5>
<fm64k6>4</fm64k6>
<dp001>1</dp001>
<tp001>19.31</tp001>
<dp002>3</dp002>
<tp002>19.26</tp002>
<dp003>4</dp003>
<tp003>19.17</tp003>
<dp004>5</dp004>
<tp004>19.17</tp004>
<dp005>6</dp005>
<tp005>19.17</tp005>
<dp006>7</dp006>
<tp006>19.17</tp006>
<dp007>8</dp007>
<tp007>19.29</tp007>
<dp008>9</dp008>
<tp008>19.18</tp008>
<dp009>10</dp009>
<tp009>19.16</tp009>
<dp010>11</dp010>
<tp010>19.21</tp010>
<dp011>12</dp011>
<tp011>19.18</tp011>
<dp012>13</dp012>
<tp012>19.17</tp012>
<dp013>14</dp013>
<tp013>19.55</tp013>
<dp014>15</dp014>
<tp014>19.26</tp014>
<dp015>16</dp015>
<tp015>19.49</tp015>
<dp016>17</dp016>
<tp016>19.34</tp016>
<dv001>0.5</dv001>
<uv001>-11.78965</uv001>
<vv001>-0.4941349</vv001>
<dv002>1</dv002>
<uv002>-11.17963</uv002>
<vv002>-4.884262</vv002>
<dv003>1.5</dv003>
<uv003>-5.343738</uv003>
<vv003>-7.117194</vv003>
<dv004>2</dv004>
<uv004>-25.02598</uv004>
<vv004>-8.714965</vv004>
<dv005>2.5</dv005>
<uv005>-33.17368</uv005>
<vv005>29.6607</vv005>
<dv006>3</dv006>
<dv007>3.5</dv007>
<uv007>-23.88369</uv007>
<vv007>12.59244</vv007>
<dv008>4</dv008>
<uv008>-9.657272</uv008>
<vv008>-12.6312</vv008>
<dv009>4.5</dv009>
<uv009>-15.87828</uv009>
<vv009>-8.478215</vv009>
<dv010>5</dv010>
<uv010>-6.829208</uv010>
<vv010>-5.549949</vv010>
</met>
</message>
For details of the GLOS OBS XML format, please refer to the documentation. However, I'd like to point out a few things here that will help with the upcoming explanation of the parser's source code.
For the Measurement with a timestamp, it could look like this in the XML file:
<gust1>5.86</gust1>
The tag gust1 is the name for wind gust according to the naming convention defined in the GLOS OBS XML format. 5.86 is the value of this measurement.
For the Measurement with a timestamp and depth, it could look like:
<dp001>1</dp001>
<tp001>19.31</tp001>
<dp002>3</dp002>
<tp002>19.26</tp002>
<dp003>4</dp003>
<tp003>19.17</tp003>
<dp004>5</dp004>
<tp004>19.17</tp004>
<dp005>6</dp005>
<tp005>19.17</tp005>
<dp006>7</dp006>
<tp006>19.17</tp006>
<dp007>8</dp007>
<tp007>19.29</tp007>
<dp008>9</dp008>
<tp008>19.18</tp008>
<dp009>10</dp009>
<tp009>19.16</tp009>
<dp010>11</dp010>
<tp010>19.21</tp010>
<dp011>12</dp011>
<tp011>19.18</tp011>
<dp012>13</dp012>
<tp012>19.17</tp012>
<dp013>14</dp013>
<tp013>19.55</tp013>
<dp014>15</dp014>
<tp014>19.26</tp014>
<dp015>16</dp015>
<tp015>19.49</tp015>
<dp016>17</dp016>
<tp016>19.34</tp016>
The XML piece above contains multiple tags whose names start with either tp or dp, followed by three digits left-padded with zeros. tp001 is the water temperature recorded at node 001, and dp001 is the depth of node 001 below the water surface. In other words, tp and dp are the names of the measurements, and they are paired by the three-digit sequence number.
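The naming rule described above (a two-letter prefix plus a three-digit, zero-padded node number) can be sketched in a few lines. `NodeTags` and `nodeTag` are hypothetical names used only for this illustration; the actual parser builds the names differently.

```java
import java.util.Locale;

/** Hypothetical helper that builds paired tag names such as "dp001" and "tp001". */
final class NodeTags {
    private NodeTags() {}

    /** Returns the prefix followed by the node number, left-padded with zeros to three digits. */
    static String nodeTag(String prefix, int node) {
        if (node < 1 || node > 999) {
            throw new IllegalArgumentException("node number must be 1..999: " + node);
        }
        // Locale.ROOT keeps the digits ASCII regardless of the default locale.
        return String.format(Locale.ROOT, "%s%03d", prefix, node);
    }
}
```

For node 16, `NodeTags.nodeTag("dp", 16)` and `NodeTags.nodeTag("tp", 16)` give the pair dp016 and tp016.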
OK, now it's time to reveal the code. Talk is cheap, show me the FXXX code :)
Implementations
The original Java code was probably compiled against JRE 1.5 or 1.6, which means you may run into some compatibility issues if you'd like to use JRE 8 or 9. So, sit tight, we are going to see some ancient code here.
The parser for GLOS OBS XML can be found here. The logic is pretty intuitive: we open the file, read the XML, try to find every tag we can possibly recognize, and then read the data. The actual parsing starts here. Let's start with the big picture and dive into details only if it's really necessary.
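As a rough picture of those steps, here is a minimal sketch that parses the XML with the JDK's built-in DOM parser and collects every tag under met. It is not the parser's actual code: `ObsXmlSketch`, `parseMessage` and `readMet` are hypothetical names, the XML is read from a string instead of a file to keep the example short, and the multi-catch assumes Java 7 or newer.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

/** Hypothetical sketch: load the XML, then list what is inside <met>. */
final class ObsXmlSketch {
    private ObsXmlSketch() {}

    /** Parses the XML text and returns its root element (<message> in the example file). */
    static Element parseMessage(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    /** Collects tag name -> text for every element directly under the first <met>. */
    static Map<String, String> readMet(Element message) {
        Map<String, String> values = new LinkedHashMap<String, String>();
        NodeList mets = message.getElementsByTagName("met");
        if (mets.getLength() == 0) {
            return values; // no <met> section, nothing to read
        }
        NodeList children = mets.item(0).getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            // Skip whitespace text nodes between the tags.
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                values.put(child.getNodeName(), child.getTextContent().trim());
            }
        }
        return values;
    }
}
```

The real parser goes the other way around: it asks for each tag it knows by name, so unknown tags are simply ignored.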
For Measurement with a timestamp, below is a code snippet. Please pay attention to the inline comments.
//We use org.w3c.dom for parsing XML, nlist is a NodeList
nlist = null;
//See if there is any tag with name gust1, if such a tag/tags exist, we parse it
nlist = message.getElementsByTagName("gust1");
if(nlist != null && nlist.getLength() > 0) {
try {
//Read the value as Float
val = Float.parseFloat(nlist.item(0).getTextContent());
//If we get a missingVal, set the value to Float.NaN
if(val != missingVal)
o.setMaxWindGust(val);
else
o.setMaxWindGust(Float.NaN);
}
catch(NumberFormatException e) {
//If the content of the tag can't be transformed to a number, set the value to Float.NaN
o.setMaxWindGust(Float.NaN);
}
}
For Measurement with a timestamp and depth, here is the procedure. Obviously it's more complicated, but don't worry, just follow the inline comments.
int depthCnt = 1;
//Tag name varies this time
String depthTagFmt = "dp%s%s";
String wtpTagFmt = "tp%s%s";
String dname = "dp001";
String wtpname = "tp001";
String prefix = null;
NodeList dlist, wtplist;
double tdepth, twtemp;
while((dlist = message.getElementsByTagName(dname)) != null &&
(wtplist = message.getElementsByTagName(wtpname)) != null &&
dlist.getLength() > 0 && wtplist.getLength() > 0) {
try {
tdepth = Double.parseDouble(dlist.item(0).getTextContent());
twtemp = Double.parseDouble(wtplist.item(0).getTextContent());
if(twtemp != missingVal) {
/*
This is ugly. I can't recall why I did this. Probably, buoy GVSU can't send water temp
in Celsius? This is still that way???
*/
if(isGVSU)
twtemp = (float)((float)(Math.round((twtemp - 32.0) * 5.0 / 9.0 * 10000)) / 10000.0);
o.setThermalString(tdepth, twtemp, ObsZ.WATER_TEMP);
}
}
catch(NumberFormatException e) { /* Skip this node if its depth or temperature is not a number */ }
//Adjust padding zero if we need to
if(++depthCnt > 999)
break;
else if(depthCnt < 10)
prefix = "00";
else if(depthCnt < 100)
prefix = "0";
else
//Three-digit numbers need no padding; without this reset, node 100 would become "dp0100"
prefix = "";
dname = String.format(depthTagFmt, prefix, depthCnt);
wtpname = String.format(wtpTagFmt, prefix, depthCnt);
}
/*
If we have a collection of thermal string data, we'd like to sort them by the depth. Why?
there is no way we can presume the org.w3c.dom XML parser will read the file sequentially, not
to mention that data provider doesn't have an obligation to sort the data for us.
*/
if(o.getThermalString() != null && o.getThermalString().size() > 0) {
Collections.sort(o.getThermalString(), new Comparator<ObsZ>() {
public int compare(ObsZ o1, ObsZ o2) {
//Compare both directions explicitly so the result stays consistent when the arguments are swapped
if(o1.depth > o2.depth)
return 1;
else if(o1.depth < o2.depth)
return -1;
else
return 0;
}
});
}
If you read the code carefully, you may ask what the heck "o.setMaxWindGust(val)" is. That's a "detail" we are going to cover. Check here for class Observation, which is a POJO. If you don't know what a POJO is, check here. Since a POJO is used, every time we add a parameter to the parser, we have to add a getter/setter for it!
For example, say we add wind gust as a new parameter, which was not in class Observation. We would have to add the following to the class:
private float maxWindGust=Float.NaN;
public float getMaxWindGust() {
return maxWindGust;
}
public void setMaxWindGust(float maxWindGust) {
this.maxWindGust = maxWindGust;
}
Now that we are done with reading, let's talk about writing the data to the database. Sorry, it's not automated either. Why? Because a POJO is used......
The code for handling the database is here. Again, when adding a new parameter, we have to add some extra code there.
For Measurement with a timestamp, here is an example:
if(!Float.isNaN(result = ob.getDewPoint())) {
if(!(result > 999.89 && result < 999.91)) {
/*
DEWP is the key for Dew Point in glos_obs_settings.properties, the configuration file;
ConfigManager.getDewPntSensorId() and ConfigManager.getDewPntMeasureId() are both
settings in glos_obs_settings.properties for DEWP
*/
Ids = getSensor(pstSelSensor, pstInsSensor, platformId, ConfigManager.getDewPntSensorId(), ConfigManager.getDewPntMeasureId(), "DEWP", alt);
if(result > 50.0 || result < -30.0)
result = FLAGGED;
InsertResult2D(pstInsObs, platformId, Ids[0], ConfigManager.getDewPntMeasureId(), ob.getDate(), lon, lat, alt, result);
}
}
For Measurement with a timestamp and depth, luckily, it's not that complicated this time:
if(ob.getThermalString() != null && ob.getThermalString().size() > 0) {
ArrayList<ObsZ> thermals = ob.getThermalString();
for(int i = 0; i < thermals.size(); ++i) {
Ids = getSensor(pstSelSensor, pstInsSensor,
platformId,ConfigManager.getThermalStringSensorId(),
ConfigManager.getThermalStringMeasureId(),
"TTAD",
-9999);
InsertResult3D(pstInsObs, platformId, Ids[0], ConfigManager.getThermalStringMeasureId(), ob.getDate(), lon, lat, alt, (float)thermals.get(i).value, (float)thermals.get(i).depth, i+1);
}
//use one extra row as the place to store the number of nodes in m_value
InsertResult3D(pstInsObs, platformId, Ids[0], ConfigManager.getThermalStringMeasureId(), ob.getDate(), lon, lat, alt, (float)thermals.size(), -9999, -9999);
}
Now, again, one thing is coupled with another. We've covered reading and writing, so now let's see what's in glos_obs_settings.properties. Here is the relevant section. This is the info we need to cross-reference with the database.
#Lookup Ids, DO NOT EDIT! obs_type, m_type
ATMP=5,5
DEWP=42,4
WDIR=3,3
WSPD=1,1
GST=2,2
CCVR=45,6
SRAD=30,7
PRES=4,8
WTMP=6,9
WVHT=13,10
WPRD=43,11
TTAD=46,43
RH1=22,22
CHLORO=10,44
YTURBI=49,45
PH=36,38
SPCOND=7,46
DISOXY=50,47
DIOSAT=51,48
YCHLOR=52,49
YBGALG=53,50
You can look up this info in the database tables obs_type and m_type.
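To show how such a line could be read, here is a minimal sketch using java.util.Properties. It is not how ConfigManager actually does it: `LookupIds`, `parse` and `lookup` are hypothetical names, and I'm assuming (based on the comment at the top of the section) that the first number is the obs_type id and the second is the m_type id.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

/** Hypothetical holder for the two ids stored under one key, e.g. DEWP=42,4. */
final class LookupIds {
    final int obsType; // assumed: first number, id in table obs_type
    final int mType;   // assumed: second number, id in table m_type

    LookupIds(int obsType, int mType) {
        this.obsType = obsType;
        this.mType = mType;
    }

    /** Parses a value such as "42,4" into its two ids. */
    static LookupIds parse(String value) {
        String[] parts = value.split(",");
        if (parts.length != 2) {
            throw new IllegalArgumentException("expected two comma-separated ids: " + value);
        }
        return new LookupIds(Integer.parseInt(parts[0].trim()),
                             Integer.parseInt(parts[1].trim()));
    }

    /** Looks up a key such as "DEWP" in the text of a properties file. */
    static LookupIds lookup(String propertiesText, String key) {
        Properties props = new Properties();
        try {
            // Lines starting with '#' are treated as comments by Properties.load.
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new IllegalStateException("could not read properties text", e);
        }
        String value = props.getProperty(key);
        if (value == null) {
            throw new IllegalArgumentException("unknown key: " + key);
        }
        return parse(value);
    }
}
```

With the section above as input, looking up DEWP would give 42 and 4, and TTAD would give 46 and 43.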
Well, I've written enough. I guess we need another thread for OBS database.