# Data input

`pypsv`'s data input is provided as a `pandas` `DataFrame` with specific column names. In the following example we show which columns are necessary and how certain columns are interpreted by the algorithm. We first give an example of a valid input `DataFrame`:

In [None]:
import numpy as np
from pandas import DataFrame

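data_input = DataFrame(
    data={
        # Declination and inclination with their uncertainties;
        # NaN marks a record without directional data.
        'D': [20, np.nan],
        'dD': [4.3, np.nan],
        'I': [-21, np.nan],
        'dI': [3.3, np.nan],
        # Intensity and its uncertainty.
        'F': [40, 45],
        'dF': [1.25, 0.7],
        # Age, its uncertainty and how both are to be interpreted.
        'Age': [879., 1595.0],
        'dAge': [20, np.nan],
        'Age type': ["14C NH", "absolute"],
    },
)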

print(data_input)

The input above consists of two rows. The first row contains direction and intensity data from a radiocarbon-dated sample, whose age will be calibrated with the Northern Hemisphere calibration curve. The second row contains only intensity data from an absolutely dated sample.

### Magnetic field data

Magnetic field data are expected as directions and intensities. The direction columns are `D` for *declination* and `I` for *inclination*; *intensity* is expected in the column `F`. At least one of these columns has to be present in the `DataFrame`, and every column present has to be accompanied by a column of *uncertainties*, named `dD`, `dI` or `dF` respectively.
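The pairing rule for field and uncertainty columns can be checked before the data are handed to `pypsv`. The following is a minimal sketch and not part of the `pypsv` API; the helper name `check_field_columns` and the constant `FIELD_COLUMNS` are hypothetical.

```python
from pandas import DataFrame

# Field columns and the uncertainty column each one requires.
FIELD_COLUMNS = {'D': 'dD', 'I': 'dI', 'F': 'dF'}


def check_field_columns(data):
    """Hypothetical helper: return the field columns present in `data`.

    Raises a ValueError if no field column is present or if a field
    column lacks its uncertainty column.
    """
    present = [col for col in FIELD_COLUMNS if col in data.columns]
    if not present:
        raise ValueError("At least one of 'D', 'I' or 'F' is required.")
    missing = [
        FIELD_COLUMNS[col] for col in present
        if FIELD_COLUMNS[col] not in data.columns
    ]
    if missing:
        raise ValueError(f"Missing uncertainty columns: {missing}")
    return present


# An intensity-only data set is valid, as long as 'dF' is given.
intensity_only = DataFrame({'F': [40, 45], 'dF': [1.25, 0.7]})
print(check_field_columns(intensity_only))
```

For the intensity-only frame this prints `['F']`; a frame with a `D` column but no `dD` column would raise a `ValueError`.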

### Age data

Age data is accepted in the columns `Age` and `dAge`. For each age, a type has to be given in the `Age type` column. Below, we give the accepted age types and the corresponding interpretations:


| `Age type`    | interpretation            |
| :-            | :-                        |
| `14C NH`      | Radiocarbon date that will be calibrated <br> using the Northern Hemisphere curve `IntCal20`. <br> `dAge` is interpreted as one standard deviation of the radiocarbon date. |
| `14C SH`      | Radiocarbon date that will be calibrated <br> using the Southern Hemisphere curve `SHCal20`. <br> `dAge` is interpreted as one standard deviation of the radiocarbon date. |
| `14C MA`      | Radiocarbon date that will be calibrated <br> using the marine calibration curve `Marine20`. <br> `dAge` is interpreted as one standard deviation of the radiocarbon date. |
| `uniform`     | The age distribution is modelled as a uniform distribution. <br> This can be useful if the sample is dated historically. <br> `Age` is expected as a calendar year, e.g. `1522` for 1522 CE or `-2412` for 2412 BCE. <br> The uniform distribution extends from `Age - dAge` to `Age + dAge`. |
| `Gaussian`    | The age distribution is modelled as a Gaussian / normal distribution. <br> This can be useful if the age is precalibrated. <br> `Age` is expected as a calendar year. <br> `dAge` gives the standard deviation of the distribution. |
| `absolute`    | The specimen is absolutely dated, with no uncertainty in the age. <br> `Age` is expected as a calendar year and `dAge` is ignored. |
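
To make the interpretation of the calendar-year age types concrete, the following sketch draws age samples for the `uniform`, `Gaussian` and `absolute` types as described in the table. It only illustrates the distributions and is not how `pypsv` treats ages internally; the helper `sample_calendar_age` is hypothetical. The radiocarbon types are left out, since they require a calibration curve.

```python
import numpy as np


def sample_calendar_age(age, d_age, age_type, n_samples, rng):
    """Hypothetical helper: draw calendar-year samples for one record."""
    if age_type == 'uniform':
        # Uniform between Age - dAge and Age + dAge.
        return rng.uniform(age - d_age, age + d_age, size=n_samples)
    if age_type == 'Gaussian':
        # dAge is the standard deviation of the normal distribution.
        return rng.normal(loc=age, scale=d_age, size=n_samples)
    if age_type == 'absolute':
        # No age uncertainty; dAge is ignored.
        return np.full(n_samples, float(age))
    raise ValueError(f"Age type {age_type!r} is not handled in this sketch.")


rng = np.random.default_rng(seed=0)

# A historically dated sample: 1522 CE, give or take 10 years.
historical = sample_calendar_age(1522, 10, 'uniform', 1000, rng)
print(historical.min() >= 1512, historical.max() <= 1532)
```

All samples of the historically dated record fall between 1512 and 1532, so the final line prints `True True`.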

### Data location

For one PSV curve, all records are expected to stem from the same location. The location is passed to the `PSVCurve` class upon initialization, as shown in the following examples. If the locations differ significantly, a "relocation" may be necessary; see e.g. "The Magnetic Field of the Earth" by Ronald Merrill et al. for further information.

### References

Reimer PJ, Austin WEN, Bard E, et al.: **The IntCal20 Northern Hemisphere  
Radiocarbon Age Calibration Curve (0–55 cal kBP)**. *Radiocarbon*. 2020  
doi:10.1017/RDC.2020.41

Hogg AG, Heaton TJ, Hua Q, et al.: **SHCal20 Southern Hemisphere  
Calibration, 0–55,000 Years cal BP**. *Radiocarbon*. 2020  
doi:10.1017/RDC.2020.59 

Heaton TJ, Köhler P, Butzin M, et al.: **Marine20—The Marine Radiocarbon  
Age Calibration Curve (0–55,000 cal BP)**. *Radiocarbon*. 2020  
doi:10.1017/RDC.2020.68

Merrill, R. T., McElhinny, M. W. & McFadden, P. L.: **The Magnetic Field of the Earth:  
Paleomagnetism, the Core, and the Deep Mantle**. *San Diego, CA: Academic Press*. 1996