# Pulsars

## Introduction
- Pulsars are rotating neutron stars that emit pulses of radiation at very regular intervals, typically ranging from milliseconds to seconds. Their very strong magnetic fields funnel jets of particles out along the two magnetic poles, and these accelerated particles produce very powerful beams of light.
- Some pulsars produce radio emission detectable here on Earth. They are of considerable scientific interest as probes of space-time, the interstellar medium, and states of matter.

### Dataset
- The data set that we use describes a sample of pulsar candidates collected during the High Time Resolution Universe Survey (South).
- The data set contains the excess kurtosis of the integrated profile, the skewness of the integrated profile, the mean of the DM-SNR curve, the excess kurtosis of the DM-SNR curve, the skewness of the DM-SNR curve, and the *Class* of the star (whether it is a pulsar or not).
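As a rough illustration of what the skewness and excess kurtosis features measure, the sketch below computes both from plain population moments. The function names and the toy profile are hypothetical, and the source does not say which estimator was used to build the data set, so treat this as one possible definition rather than the survey's method.

```r
# Moment-based shape statistics (population moments; an assumption, since the
# source does not state which estimator the data set uses).
moment_skewness <- function(x) {
    d <- x - mean(x)
    mean(d^3) / mean(d^2)^1.5
}

moment_excess_kurtosis <- function(x) {
    d <- x - mean(x)
    # Subtracting 3 makes a Gaussian score 0, hence "excess" kurtosis
    mean(d^4) / mean(d^2)^2 - 3
}

# Hypothetical toy "integrated profile": flat baseline with one narrow pulse
toy_profile <- c(0, 0, 0, 1, 6, 1, 0, 0, 0, 0)

moment_skewness(toy_profile)         # positive: the pulse forms a long right tail
moment_excess_kurtosis(toy_profile)  # positive: more peaked than a Gaussian
```

A symmetric input such as `c(1, 2, 3, 4, 5)` gives a skewness of zero under this definition, which is a quick sanity check on the formula.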

In [25]:
library(tidyverse)
library(tidymodels)
library(repr)
library(rvest)
library(stringr)
library(janitor)
# options(repr.matrix.max.rows = 10)

set.seed(100)

In [37]:
pulsar_df <- read_csv("data/HTRU_2.arff", skip = 11, col_names = FALSE) |>
    rename("Profile_mean" = X1,
       "Profile_stdev" = X2,
       "Profile_skewness" = X3,
       "Profile_kurtosis" = X4,
       "DM_mean" = X5, 
       "DM_stdev" = X6, 
       "DM_skewness" = X7,
       "DM_kurtosis" = X8,
       "class" = X9     
      ) |>
    mutate(class = as_factor(class)) |>
    mutate(Class = fct_recode(class, "notpulsar" = "0", "pulsar" = "1")) |>
    select(-Profile_mean, -Profile_stdev, -DM_stdev, -class)
pulsar_df

pulsar_split <- initial_split(pulsar_df, prop = 0.75, strata = Class)
pulsar_train <- training(pulsar_split)
pulsar_test <- testing(pulsar_split)

Rows: 17898 Columns: 9
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
dbl (9): X1, X2, X3, X4, X5, X6, X7, X8, X9

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.


| Profile_skewness | Profile_kurtosis | DM_mean | DM_skewness | DM_kurtosis | Class |
|---|---|---|---|---|---|
| `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<fct>` |
| -0.23457141 | -0.6996484 | 3.199833 | 7.975532 | 74.24222 | notpulsar |
| 0.46531815 | -0.5150879 | 1.677258 | 10.576487 | 127.39358 | notpulsar |
| 0.32332837 | 1.0511644 | 3.121237 | 7.735822 | 63.17191 | notpulsar |
| -0.06841464 | -0.6362384 | 3.642977 | 6.896499 | 53.59366 | notpulsar |
| 0.60086608 | 1.1234917 | 1.178930 | 14.269573 | 252.56731 | notpulsar |
| ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ |
| -0.1878456 | -0.73812297 | 1.296823 | 15.450260 | 285.931022 | notpulsar |
| 0.1279781 | 0.32306090 | 16.409699 | 2.945244 | 8.297092 | notpulsar |
| 0.1593631 | -0.74302540 | 21.430602 | 2.499517 | 4.595173 | notpulsar |
| 0.2011614 | -0.02478884 | 1.946488 | 10.007967 | 134.238910 | notpulsar |
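The stratified split above is meant to keep the class proportions similar in the training and testing sets. A minimal sketch of how that could be checked follows, using a hypothetical imbalanced toy data frame (`toy_df`) in place of `pulsar_df` so that it runs without the data file. The helper `class_share` is also hypothetical, and the 90/10 class ratio is an arbitrary choice for illustration, not the ratio in the real data.

```r
library(dplyr)
library(rsample)

set.seed(100)

# Hypothetical imbalanced toy data standing in for pulsar_df
toy_df <- data.frame(
    x = rnorm(1000),
    Class = factor(rep(c("notpulsar", "pulsar"), times = c(900, 100)))
)

# Same call pattern as the split above: 75% training, stratified by Class
toy_split <- initial_split(toy_df, prop = 0.75, strata = Class)
toy_train <- training(toy_split)
toy_test <- testing(toy_split)

# Share of each class within a data frame
class_share <- function(df) {
    df |>
        count(Class) |>
        mutate(share = n / sum(n))
}

class_share(toy_train)
class_share(toy_test)
```

If the `share` column looks alike in both tables, stratification has done its job. The same helper could be applied to `pulsar_train` and `pulsar_test`.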


In [30]:
# Distinct class labels (the raw `class` column was recoded into `Class` above)
pulsar_df |>
    distinct(Class)
# Number of candidates labelled as pulsars
pulsar_df |>
    filter(Class == "pulsar") |>
    nrow()
# Number of candidates labelled as non-pulsars
pulsar_df |>
    filter(Class == "notpulsar") |>
    nrow()
# Both counts in a single table
pulsar_df |>
    count(Class)

Class
<fct>
notpulsar
pulsar
