/
Filtering_snow_height.Rmd
112 lines (76 loc) · 8.99 KB
/
Filtering_snow_height.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
---
title: "Filtering_snow_height.R"
output: github_document
---
# Title: "Filtering_snow_height.R"
## Introduction
The script Filtering_snow_height.R in folder *inst* want to increase quality of data of snow height signal measured with a SR50AT sensor produced by Campbell Scientific ([Link](https://www.campbellsci.com/sr50at-l)). In our data there is some problems, for exmaple some data are out of a phisycal range. To identify these sampling problems we apply some thresholds on range and on increasing/decreasing rate filtering improbable values. After these process the main objective is filtering remaining noise. We apply 2 methods: a moving average filter with a window of 5 hours and a Savitzky-Golay smoothing filter.Documentation of Savitzky-Golay smoothing filter are available online (https://cran.r-project.org/web/packages/signal/signal.pdf)[1]. The main problem of moving average is the smoothing of true snowfall peaks, the Savitzky-Golay smoothing filter seems to exclude this, if we set a small a filter length. In our analysis we observe some strange phenomenous during snow melting. The snow height signal, expecially during warm and sunny days in spring, has a minimum in the middle of afternoon and increase during the night. Our first hypothesis was that the temperature correction applied, as suggest in User Manual, was not enough. The second hypotesys was that the snow react in a different way depending on his status. So the ultrasonic signal penetrates more in melted snow than in fresh or not melted snow. Open issues: how to filter this snow signal?
## Description of script
* **Section 1:** in this section you have to select *git_folder* and *file* of data. This files must contain the column: **Snow_Height**
* **Section 2:** here the script import data and extract snow column using the input section.
* **Section 3:** more realistic data are created in this section using snow depth calibration point. In the file **Snow_Depth_Calibration_file.csv** in folder *data/Snow_Depth_Calibration/* you can insert real snow surveys (snow height under ultrasoni sensor), or virtual snow survey (observing time series and set the value at the dates of end of snow season as 0 cm). With these informations we can calibrate snow height time series. The calibration function apply a linear transformation between 2 snow depth measurements, and a constant transformation from the last snow survey and the end of time series.
* **Section 4:** in this section the algorithm substitute value out of range and value with high increasing/decreasing rate with NA. Data out of phisycal range are considered improbable, as data that change fastly.
* **Section 5:** here the algorithm perform a filtering of snow height signal in two different ways. The first is a moving average filter, which has the problem of smoothing of true snowfall signal. The second is a Savitzky-Golay smoothing filter that seems to work better than moving average expecially setting small value for filter period. The smoothig process can create values with not realistic behaviour, so a rate thresholds as in section 3 is applied on filtered value, as suggested in Comai thesis [2].
At the end of **Section 5** you can select which is your favourite smoothing filter to save in **Section 6**
* **Section 6:** in this section the algorithm save a dataframe cointaining some partial results of filtering snow signal as:
+ *Snow_file.Rdata* in folder *data/Output/Snow_Filtering_RData/* for visualization tool
+ *Snow_file.csv* in folder *data/Output/Snow_Filtering/* for storage results. In this file you can find the original snow height time series, the snow height cleaned using a range threshold,the snow height cleaned using a rete threshold, the snow height calibrated, the snow height filtered with method selected, and the snow height smoothed and checked with rate threshold.
## Datasets
**INPUT:**
1. **git_folder**: source of package **SnowSeasonAnalysis**
2. **file**: name of .csv file to process available in folder */data/Input data*
3. **SNOW_HEIGHT**: the column name of *file* corresponding with snow height parameter. Default (LTER stations) is "Snow_Height"
4. **Range_min_max**: path of file containing range thresholds for every variable. Data out of range min/max are replaced with NA **(don't edit this path)**
5. **Rate_min_max**: path of file containing rate thresholds for every variable. Data out of range max increase/decrease are replaced with NA **(don't edit this path)**
6. **folder_surveys**: path of file containing the snow depth calibration point **(don't edit this path)**. *Snow_Depth_Calibration_file.csv* should be:
+ The first column, called **date**, is the date and time of measurement, and the second, called **snow_height**, is the snow height at the station, as near as possible at the sensor.
+ The **date** should be **hourly**, date format is *YYYY-MM-DD hh:mm* (where mm is 00). Example:
+ Correct: 2017-02-15 15:00
+ Incorrect: 2017-02-15 15:30 (minutes not admitted)
+ The **snow_height** should be in **meters**. You should convert snow surveys from cm to m (divide per 100)
+ Snow surveys should be orderd from the **oldest** to the **most recent**. Remember to reorder snow surveys every times you insert new one. If the snow surveys are not orederd the calibration don't work properly
**METHOD:**
Select in section METHOD the one of the following smoothing filter. Assign to **SMOOTH_METHOD**: "Savitzky_Golay"
1. *"Moving_Avergae"*: apply to data quality checked a 5 hour moving average filter (Parameter setting manually: PERIOD_LENGTH = 5). Data should be filled before (No NA are admitted)
2. *"Savitzky_Golay"*: apply to data quality checked a Savitzky-Golay filter (Parameters setting manually:FILTER_ORDER = 1,FILTER_LENGTH = 9). Data should be filled before (No NA are admitted)
**OUTPUT:**
1. **Snow_file.RData**: in folder *data/Output/Snow_Filtering_RData/* an .RData file which contain a list of zoo time series of:
+ *HS_original*: original data
+ *HS_calibrated*: snow height calibrated
+ *HS_range_QC*: snow height cleaned using a range threshold
+ *HS_rate_QC*: snow height cleaned using a rete threshold
+ *HS_calibr_smoothed*: the snow height filtered with method selected in **SMOOTH_METHOD**
+ *HS_calibr_smooothed_rate_QC*: snow height smoothed and checked with rate threshold
2. **Snow_file.csv**: in folder *data/Output/Snow_Filtering/* a.csv file which contains some time series of snow height and manipulation. The columns are:
+ *TIMESTAMP*: date and time of data
+ *HS_original*: original data
+ *HS_calibrated*: snow height calibrated
+ *HS_range_QC*: snow height cleaned using a range threshold
+ *HS_rate_QC*: snow height cleaned using a rete threshold
+ *HS_calibr_smoothed*: the snow height filtered with method selected in **SMOOTH_METHOD**
+ *HS_calibr_smooothed_rate_QC*: snow height smoothed and checked with rate threshold
Note: in the output names above the algorithm substitute automatically **"file"** with the name of station, setting in **file <- "..."** (INPUT 2)
**NOTE**
Note on file *Snow_Depth_Calibration_FILE.csv*. This file contain all snow surveys and snow calibration point used to calibrate raw data of ultrasonic snow sensor.
1. The first column, called **date**, is the date and time of measurement, and the second, called **snow_height**, is the snow height at the station, as near as possible at the sensor.
2. The **date** should be **hourly**, date format is *YYYY-MM-DD hh:mm* (where mm is 00). Example:
+ Correct: 2017-02-15 15:00
+ Incorrect: 2017-02-15 15:30 (minutes not admitted)
3. The **snow_height** should be in **meters**. You should convert snow surveys from cm to m (divide per 100)
4. Snow surveys should be orderd from the **oldest** to the **most recent**. Remember to reorder snow surveys every times you insert new one. If the snow surveys are not orederd the calibration don't work properly
## How to use
Open script *Filtering_snow_height.R* and:
1. Set **git folder**, the path where the package is download or used.
2. Run **Section 1** (up to **INPUT**) to explore data available in folder *data/Input data*
3. Select the file (containing **Snow_Height**) to process
4. Run **Section 2** to read data
5. Run **Section 3** to calibrate snow height time series
6. Run **Section 4** to perform a quality check based on range and on rate of data
7. Run **Section 5** (up to **INPUT**) to filter signal using moving average filter and Savitzky-Golay smoothing filters
8. Select in **INPUT** the data filtered you want to keep
9. Run the other part of **Section 5**
10. Run **Section 6** to save outputs
## Reference
[1] William H. Press, Saul A. Teukolsky, William T. Vetterling, Brian P. Flannery, Numerical Recipes
in C: The Art of Scientific Computing , 2nd edition, Cambridge Univ. Press, N.Y., 1992.
[2] Comai T. ,Analisi spaziale e temporale delle precipitazioni nevose nelle alpi italiane, Rel. Rigon Riccardo, Università degli Studi di Trento, AA 2013/2014