This document analyzes meteorological data sets containing rainfall measurements from five different points. The goal is to determine whether there are statistically significant differences in the rain patterns between these points. The stations are located close to each other, a few kilometers apart.
The following libraries are imported in the code:
- ggplot2
- tidyverse
- data.table
- geosphere
- Metrics
- rmutil
- pathmapping
- lubridate
- knitr
- here
The code imports two CSV files containing rain data from different points. The files are read using the `read.table` function and stored in two data frames, `df1` and `df2`. The file paths are constructed using the `here` function from the `here` library.
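The import step could look roughly like the sketch below. It assumes comma-separated files with a header row, which the text does not confirm. The helper name `read_rain_csv`, the sample values, and the commented-out path are hypothetical placeholders.

```r
# Hypothetical helper: read one comma-separated rain file with a header row.
# sep = "," and header = TRUE are assumptions about the file layout.
read_rain_csv <- function(path) {
  read.table(path, sep = ",", header = TRUE, stringsAsFactors = FALSE)
}

# Self-contained demonstration on a temporary file with made-up values.
tmp <- tempfile(fileext = ".csv")
writeLines(c("Date,Station.1,Station.2",
             "2021-01-01,0.0,1.2",
             "2021-01-02,3.4,NA"), tmp)
demo <- read_rain_csv(tmp)
str(demo)

# In the project itself the path would come from here::here(), for example:
# df1 <- read_rain_csv(here::here("data", "rain_points.csv"))  # hypothetical path
```

Wrapping `read.table` in a small helper keeps the separator and header settings in one place, so both files are read the same way.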
The code performs various data preprocessing steps to prepare the data for analysis:
- Transposing the `df1` data frame to convert rows to columns and add correct column headers.
- Extracting the coordinates of the point of interest (`Point_0`) from the transposed data frame.
- Removing unnecessary rows and columns from `df1` and `df2`.
- Dropping rows with missing values (NA) from `df1` and `df2`.
- Calculating the Euclidean and geographical distances from each point to `Point_0` using the coordinates.
- Creating indexes for convenient slicing and managing the data.
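The preprocessing steps above can be sketched as follows. The table layout, the column names `lon` and `lat`, and all coordinate values are assumptions made for illustration. The `geosphere` package listed earlier provides `distHaversine` for great-circle distances; the hand-written `haversine_m` helper here is a stand-in that keeps the sketch free of dependencies.

```r
# Made-up metadata table with one column per point, as df1 might look
# before transposing (layout and values are assumptions).
raw <- data.frame(
  field   = c("lon", "lat"),
  Point_0 = c(25.000, 60.000),
  Point_1 = c(25.020, 60.010),
  Point_2 = c(24.990, 59.985)
)

# Transpose so that each point becomes a row, then restore the headers.
pts <- as.data.frame(t(raw[, -1]))
colnames(pts) <- raw$field

# Coordinates of the point of interest.
p0 <- pts["Point_0", ]

# Drop rows with missing values.
pts <- na.omit(pts)

# Euclidean distance in degrees (a rough proxy only, not meters).
pts$dist_euclid <- sqrt((pts$lon - p0$lon)^2 + (pts$lat - p0$lat)^2)

# Great-circle (haversine) distance in meters on a sphere of radius r.
haversine_m <- function(lon1, lat1, lon2, lat2, r = 6378137) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * r * asin(pmin(1, sqrt(a)))
}
pts$dist_geo_m <- haversine_m(p0$lon, p0$lat, pts$lon, pts$lat)
print(pts)
```

The Euclidean distance on raw degrees is shown only to mirror the text; for stations a few kilometers apart, the haversine distance is the one that yields meters.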
The code generates a map plot using the `ggplot2` library to visualize the station settings. The plot shows all stations as blue points and the point of interest (`Point_0`) as a red point. The title of the plot displays the total area covered by the stations in hectares and the distances in meters.
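A plot of this kind might be built as in the sketch below. The coordinates are invented and the column names are assumptions. The title is a placeholder, since the text does not say how the area and distances are computed.

```r
library(ggplot2)

# Made-up station coordinates; Point_0 is the point of interest.
stations <- data.frame(
  name = c("Point_0", "Station.1", "Station.2", "Station.3"),
  lon  = c(25.000, 25.020, 24.990, 25.010),
  lat  = c(60.000, 60.010, 59.985, 59.990)
)
stations$is_poi <- stations$name == "Point_0"

# Stations in blue, point of interest in red, each labelled by name.
p <- ggplot(stations, aes(x = lon, y = lat)) +
  geom_point(data = subset(stations, !is_poi), colour = "blue", size = 3) +
  geom_point(data = subset(stations, is_poi), colour = "red", size = 4) +
  geom_text(aes(label = name), vjust = -1) +
  labs(title = "Station layout (placeholder for area in ha and distances in m)",
       x = "Longitude", y = "Latitude")
print(p)
```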
Descriptive statistics are calculated for the rainfall data, including the sum, mean, and variance. The statistics are displayed in a table format.
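As an illustration, the summary table could be assembled like this. The rainfall values and the short station names are made up, and the use of `na.rm = TRUE` is an assumption.

```r
# Made-up daily rainfall in mm; station names are placeholders.
rain <- data.frame(
  Station.1 = c(0.0, 2.5, 10.1, 0.0, 4.2),
  Station.2 = c(0.0, 1.9, 8.7, 0.3, 3.8),
  Station.3 = c(0.2, 3.1, 11.4, 0.0, 5.0)
)

# One row per station with sum, mean, and sample variance.
stats <- data.frame(
  station   = names(rain),
  sum       = sapply(rain, sum, na.rm = TRUE),
  mean      = sapply(rain, mean, na.rm = TRUE),
  variance  = sapply(rain, var, na.rm = TRUE),
  row.names = NULL
)
print(stats)

# In a knitr report the table could be rendered with knitr::kable(stats).
```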
The code performs hypothesis testing to evaluate the differences between the rainfall measurements at different stations. The hypothesis being tested is that there are different microclimates within the area.
The code starts with a two-sample t-test between Station.3_01204052 and Station.2_0120593F, comparing the largest and smallest values. The test statistic, degrees of freedom, critical value, p-value, and decision are calculated. A confidence interval is also estimated.
The results of the hypothesis testing, including the test statistic, critical value, probability, p-value, decision, and confidence interval, are displayed in a table format.
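A minimal sketch of such a test and its result table is shown below. It uses simulated samples in place of the two stations and assumes a two-sided Welch test at a 5% significance level. The text does not state the significance level or whether equal variances were assumed, so both are illustrative choices.

```r
set.seed(1)
# Made-up samples standing in for the two stations compared in the text.
x <- rnorm(30, mean = 5.0, sd = 2)
y <- rnorm(30, mean = 4.0, sd = 2)
alpha <- 0.05  # assumed significance level

# Welch two-sample t-test (the default of t.test; unequal variances allowed).
tt <- t.test(x, y, conf.level = 1 - alpha)

t_stat <- unname(tt$statistic)
df     <- unname(tt$parameter)
t_crit <- qt(1 - alpha / 2, df)  # two-sided critical value
p_val  <- tt$p.value
decision <- if (abs(t_stat) > t_crit) "reject H0" else "fail to reject H0"

# Collect everything in a one-row table.
result <- data.frame(
  statistic = t_stat, df = df, critical = t_crit, p_value = p_val,
  decision = decision,
  ci_lower = tt$conf.int[1], ci_upper = tt$conf.int[2]
)
print(result)
```

Here H0 is that the two stations have equal mean rainfall. Comparing the absolute test statistic with the critical value gives the same decision as comparing the p-value with `alpha`.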