# Uranium Dataset Exploratory Analysis (R)

This Jupyter notebook demonstrates how to explore the uranium mining dataset in R. It reads `config.json` with **jsonlite** to locate the CSV file, loads the data into a `data.frame`, and performs simple exploratory data analysis with **dplyr** and **ggplot2**. Each code cell is commented to explain its purpose and the steps it takes.

*Note:* The R kernel must be installed in your Jupyter environment for this notebook to execute. You can usually install it from an R session with `install.packages('IRkernel')` followed by `IRkernel::installspec()`.

In [None]:
# Load required libraries
# We use jsonlite to read the configuration,
# dplyr for data manipulation, and ggplot2 for visualisation.
library(jsonlite)
library(dplyr)
library(ggplot2)

# It's a good idea to check that the packages are installed.
# If any of the library() calls above fails, install the missing package
# with install.packages().
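
The check suggested in the comments above can be made explicit before any `library()` call runs. The following is a minimal sketch using only base R; the variable names `required`, `available` and `missing_pkgs` are illustrative and do not appear elsewhere in the notebook.

```r
# Packages this notebook relies on.
required <- c("jsonlite", "dplyr", "ggplot2")

# requireNamespace() returns TRUE if a package can be loaded and FALSE
# otherwise, without attaching it to the search path.
available <- vapply(required, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- required[!available]

if (length(missing_pkgs) > 0) {
    message("Missing packages: ", paste(missing_pkgs, collapse = ", "),
            ". Install them with install.packages().")
} else {
    message("All required packages are available.")
}
```

Reporting every missing package in one message avoids discovering them one failed `library()` call at a time.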

In [None]:
# Read configuration from config.json
# This assumes the working directory is the root of the uranium_dataset package.
config <- fromJSON('config.json')

# Take the dataset location from the configuration.  The value may be a
# relative path, in which case R resolves it against the working directory.
dataset_path <- config$dataset_path

# Load the uranium dataset.  We set stringsAsFactors = FALSE to keep character
# data as strings rather than factors (which are less convenient for
# summarisation).  This is already the default from R 4.0.0 onwards, so the
# argument only matters on older versions.
uranium <- read.csv(dataset_path, stringsAsFactors = FALSE)

# Inspect the first few rows to ensure the data loaded correctly.
head(uranium)
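
Because the configured path may be relative, loading depends on the notebook's working directory. One way to make that explicit is to resolve the path against a known base directory first. The sketch below uses base R only; `resolve_path()` is a hypothetical helper, and the example paths are placeholders rather than values taken from `config.json`.

```r
# Hypothetical helper: return `path` unchanged if it is already absolute,
# otherwise join it to `base_dir`.
resolve_path <- function(path, base_dir = getwd()) {
    # Treat Unix-style roots, home-relative paths and Windows drive
    # letters as absolute.
    is_absolute <- grepl("^(/|~|[A-Za-z]:[/\\\\])", path)
    if (is_absolute) path else file.path(base_dir, path)
}

# Placeholder paths for illustration only.
relative_example <- resolve_path("data/example.csv", base_dir = "/tmp/project")
absolute_example <- resolve_path("/data/example.csv", base_dir = "/tmp/project")
print(relative_example)
print(absolute_example)
```

In the notebook this could be used as `dataset_path <- resolve_path(config$dataset_path)`, followed by `stopifnot(file.exists(dataset_path))` so that a wrong path fails with a clear error before `read.csv()` is called.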

In [None]:
# Summarise the number of records per state
# We use dplyr to group the data by state and count the number of rows.
state_counts <- uranium %>%
    group_by(state) %>%
    summarise(count = n()) %>%
    arrange(desc(count))

# Display the counts.  The head() function shows the top entries.
head(state_counts, 10)

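# Plot the counts using ggplot2.  We reorder the states by descending count,
# draw a bar chart, and flip the axes so the state names are easy to read.
# After coord_flip() the first level sits at the bottom, so the largest count
# appears there; use reorder(state, count) to put it at the top instead.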
ggplot(state_counts, aes(x = reorder(state, -count), y = count)) +
    geom_bar(stat = 'identity') +
    theme_minimal() +
    labs(title = 'Uranium mining records by state', x = 'State', y = 'Count') +
    coord_flip()
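
The group, summarise and arrange pipeline above can also be written with dplyr's `count()`. The sketch below compares the two forms on a small made-up data frame; `toy`, `long_form` and `short_form` are illustrative names, and the state labels are placeholders rather than values from the dataset.

```r
library(dplyr)

# Toy data standing in for the uranium data frame (values are made up).
toy <- data.frame(state = c("A", "B", "A", "C", "A", "B"),
                  stringsAsFactors = FALSE)

# Long form, as used in the cell above.
long_form <- toy %>%
    group_by(state) %>%
    summarise(count = n()) %>%
    arrange(desc(count))

# Short form: count() groups and tallies in one call; sort = TRUE orders
# the result by descending count and name sets the output column name.
short_form <- toy %>% count(state, sort = TRUE, name = "count")

print(long_form)
print(short_form)
```

Both forms give the same states and counts on this toy input. The long form makes each step visible, which suits a tutorial, while `count()` is more compact for routine use.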

In [None]:
# Summarise the number of records per deposit type (dep_type)
# Replace missing values with '<missing>' for clarity.
uranium$dep_type_clean <- ifelse(is.na(uranium$dep_type) | uranium$dep_type == '', '<missing>', uranium$dep_type)

dep_counts <- uranium %>%
    group_by(dep_type_clean) %>%
    summarise(count = n()) %>%
    arrange(desc(count))

head(dep_counts, 10)

# Visualise deposit type frequencies for the top categories
top_dep <- dep_counts %>% slice_head(n = 10)
ggplot(top_dep, aes(x = reorder(dep_type_clean, -count), y = count)) +
    geom_bar(stat = 'identity') +
    theme_minimal() +
    labs(title = 'Top deposit types (dep_type)', x = 'Deposit type', y = 'Count') +
    coord_flip()
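
The `ifelse()` cleaning step above can alternatively be expressed with dplyr's `na_if()` and `coalesce()`. This is a sketch on made-up values, not the dataset's real deposit types: `na_if()` turns empty strings into `NA`, and `coalesce()` then replaces every `NA` with the placeholder label.

```r
library(dplyr)

# Toy data with one empty string and one NA (type names are made up).
toy <- data.frame(dep_type = c("type_a", "", NA, "type_b"),
                  stringsAsFactors = FALSE)

# Approach from the cell above.
via_ifelse <- ifelse(is.na(toy$dep_type) | toy$dep_type == "",
                     "<missing>", toy$dep_type)

# Alternative: empty string -> NA, then NA -> "<missing>".
toy <- toy %>%
    mutate(dep_type_clean = coalesce(na_if(dep_type, ""), "<missing>"))

print(toy)
```

Both approaches label the empty and `NA` entries as `<missing>` and leave the other values untouched.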

In [None]:
# Compute basic numeric summaries for latitude and longitude
# For each column, summary() reports the minimum, first quartile, median,
# mean, third quartile and maximum, plus the number of NA values if any.
summary(uranium[, c('latitude', 'longitude')])
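
A natural follow-up to the numeric summary is a range check on the coordinates. The sketch below assumes that `latitude` and `longitude` are in decimal degrees, which the source does not state, and runs on made-up values. The names `toy`, `lat_ok`, `lon_ok` and `suspect` are illustrative.

```r
# Toy coordinates (made up): one missing latitude, one out-of-range latitude.
toy <- data.frame(
    latitude  = c(38.5, 41.2, NA, 95.0),
    longitude = c(-109.6, -105.3, -107.9, -110.1)
)

# Count missing values per column.
na_counts <- colSums(is.na(toy))

# Valid decimal-degree ranges; NA values are treated as not OK.
lat_ok <- !is.na(toy$latitude)  & toy$latitude  >= -90  & toy$latitude  <= 90
lon_ok <- !is.na(toy$longitude) & toy$longitude >= -180 & toy$longitude <= 180

# Rows that need a closer look.
suspect <- toy[!(lat_ok & lon_ok), ]

print(na_counts)
print(suspect)
```

Applied to the real data, the same checks would use `uranium$latitude` and `uranium$longitude` in place of the toy columns.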