Hypothesis Testing Documentation

Introduction

This document provides an analysis of meteorological data sets containing rainfall measurements from five different points. The goal is to determine if there are statistically significant differences in the rain patterns between these points. The stations are located close to each other, with a few kilometers of distance between them.

Documentation Access

This Project:

Hypothesis Testing Climate

Other Projects:

Libraries Used

The following libraries are imported in the code:

ggplot2
tidyverse
data.table
geosphere
Metrics
rmutil
pathmapping
lubridate
knitr
here

Data Import

The code imports two CSV files containing rain data from different points. The files are read using the read.table function and stored in two data frames, df1 and df2. The file paths are constructed using the here function from the here library.

Data Preprocessing

The code performs various data preprocessing steps to prepare the data for analysis:

Transposing the df1 data frame to convert rows to columns and add correct column headers.
Extracting the coordinates of the point of interest (Point_0) from the transposed data frame.
Removing unnecessary rows and columns from df1 and df2.
Dropping rows with missing values (NA) from df1 and df2.
Calculating the Euclidean and geographical distances from each point to Point_0 using the coordinates.
Creating indexes for convenient slicing and managing the data.

Map Plot

The code generates a map plot using the ggplot2 library to visualize the station settings. The plot shows all stations as blue points and the point of interest (Point_0) as a red point. The title of the plot displays the total area covered by the stations in hectares and the distances in meters.

Descriptive Statistics

Descriptive statistics are calculated for the rainfall data, including the sum, mean, and variance. The statistics are displayed in a table format.