# Inserting and editing records using XML

This notebook provides examples of how to use the xmu library to read and write records to EMu using the XML import format. The XML format better represents the nested structure of EMu records but requires larger files than CSV.

We'll begin by importing the necessary objects from the xmu library:

In [None]:
from xmu import EMuDate, EMuLatitude, EMuReader, EMuRecord, write_xml

The EMuRecord class is used to parse or create records for EMu. It is based on the native `dict` data type in Python and uses the following conventions to represent records:

- Fields are identified by their backend name, including the Ref suffix for attachments. Backend names can be found in the field help in the client.
- Atomic (single-value) fields can be entered as strings, integers, etc.
- Tables are lists.
- Nested tables are lists of lists.
- Attachments are represented as `dict`s following the same conventions.
- Data types that require special handling (like dates, times, and coordinates) are mapped to custom classes to help prepare the data for EMu.

The grouping mechanism in the EMu report feature is not supported, but see below for another approach to grouping related fields.

Here is an example EMuRecord created from scratch:

In [None]:
rec = EMuRecord({
    "LocCountry": "United States",
    "LocProvinceStateTerritory": "Maine",
    "LocPreciseLocation": "Wales",
    "LatLatitude_nesttab": [["44°10′0″N"]],
    "LatLongitude_nesttab": [["70°3′54″W"]],
    "LatDatum_tab": ["WGS 84 (EPSG:4326)"],
    "ColDateVisitedFrom": "Jan 1970",
    "NteText0": ["API test record"],
    "NteAttributedToRef_nesttab": [[{"NamFirst": "Ima", "NamLast": "Test"}]],
}, module="ecollectionevents")

rec
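
To make the conventions listed above concrete, the following sketch uses a plain `dict` rather than an EMuRecord, so it runs without xmu. The helper `describe_field` is hypothetical and is not part of the library; it only classifies each value by its Python type.

```python
# Plain-dict stand-in for a record; no xmu import is needed for this sketch.
record = {
    "LocCountry": "United States",
    "LatDatum_tab": ["WGS 84 (EPSG:4326)"],
    "LatLatitude_nesttab": [["44°10′0″N"]],
    "NteAttributedToRef_nesttab": [[{"NamFirst": "Ima", "NamLast": "Test"}]],
}


def describe_field(value):
    """Hypothetical helper: classify a value using the conventions above."""
    if isinstance(value, dict):
        return "attachment"
    if isinstance(value, list):
        # A non-empty list whose rows are all lists is a nested table
        if value and all(isinstance(row, list) for row in value):
            return "nested table"
        return "table"
    return "atomic"


for name, value in record.items():
    print(f"{name}: {describe_field(value)}")
```

Note that the attachment in `NteAttributedToRef_nesttab` sits inside a nested table, so the field as a whole is reported as a nested table whose cells are `dict`s.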

A list of records can be written to XML using `write_xml()`:

In [None]:
write_xml([rec], "import.xml")

XML reports (including the one we just created) can be read using the EMuReader class. Records are streamed from the file, not read in all at once, so very large files can be processed using this class.

In [None]:
reader = EMuReader("import.xml", rec_class=EMuRecord)
for rec in reader:
    display(rec)
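
The streaming behavior described above can be illustrated with the standard library alone. This is a minimal sketch of the general technique, not xmu's implementation: the XML layout is a simplified assumption (flat `<tuple>` records containing only `<atom>` elements), and `iter_records` is a hypothetical helper. Files with nested tables would need extra handling.

```python
import io
import xml.etree.ElementTree as ET

# Simplified layout assumed for illustration: one flat <tuple> per record.
xml_data = b"""<?xml version="1.0" encoding="UTF-8"?>
<table name="ecollectionevents">
  <tuple><atom name="LocCountry">United States</atom></tuple>
  <tuple><atom name="LocCountry">Canada</atom></tuple>
</table>"""


def iter_records(source):
    """Hypothetical helper: yield one dict per <tuple> as it is parsed."""
    for _, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "tuple":
            yield {atom.get("name"): atom.text for atom in elem.iter("atom")}
            # Discard the parsed children so memory use stays low
            elem.clear()


for record in iter_records(io.BytesIO(xml_data)):
    print(record)
```

Because the generator yields each record as soon as its closing tag is seen, the caller never holds more than one record's parsed content at a time.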

Existing records can be updated in much the same way. Any record that includes an IRN field is interpreted as an update. The xmu library supports the row operators used by the EMu import feature, including append (+), prepend (-), and replace (1= for the first row, 2= for the second, etc.). When the import file is generated, appended rows are grouped automatically. The groups are defined in the EMu schema file and can be customized in the .xmurc config file.

In [None]:
update = EMuRecord({
    "irn": 1234567,
    "LatLatitude_nesttab(+)": [["44 10 N"]],
    "LatLongitude_nesttab(+)": [["70 4 W"]],
    "LatDatum_tab(+)": ["WGS 84"],
}, module="ecollectionevents")

update
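
The row operators are written as a suffix on the field name. As a rough illustration of that naming pattern, the sketch below separates a field name from its operator with a regular expression. The helper `split_operator` is hypothetical and is not part of xmu; it handles only the three operator forms mentioned above.

```python
import re

# Matches a trailing row operator such as (+), (-), or (1=)
OPERATOR = re.compile(r"^(?P<name>\w+)\((?P<op>\+|-|\d+=)\)$")


def split_operator(field):
    """Hypothetical helper: split 'Field(op)' into (field, op or None)."""
    match = OPERATOR.match(field)
    if match is None:
        return field, None
    return match.group("name"), match.group("op")


for key in ["irn", "LatDatum_tab(+)", "LatDatum_tab(-)", "LatDatum_tab(2=)"]:
    print(key, "=>", split_operator(key))
```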

Once the updates have been prepared, they can be written to an import file using the same `write_xml()` function used to insert new records:

In [None]:
write_xml([update], "update.xml")

## Custom data classes

xmu defines custom classes to handle data types that need to be formatted before being ingested into EMu. If a schema file is provided, these classes are applied automatically when the EMuRecord is created. Each data class outputs the data in a format recognized by EMu while keeping as close to the original value as possible. In the example record above, the values for latitude and longitude include symbols for degrees, minutes, and seconds that are stripped when formatting the data for EMu.

In [None]:
dms_lat = EMuLatitude("44°10′0″N")
f"{dms_lat.verbatim} => {dms_lat.emu_str()}"

By contrast, if decimal degrees are provided, the EMuLatitude and EMuLongitude classes will use the decimal format when showing the data:

In [None]:
deg_lat = EMuLatitude(44.1667)
f"{deg_lat.verbatim} => {deg_lat.emu_str()}"
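
The two latitudes used above, 44°10′0″N and 44.1667, describe the same position. The following sketch checks that with plain arithmetic. It does not use xmu, and `dms_to_decimal` is a hypothetical helper written for this example rather than a library function.

```python
import re


def dms_to_decimal(verbatim):
    """Hypothetical helper: convert a DMS string like 44°10′0″N to decimal degrees."""
    # Pull out the numeric parts and the trailing hemisphere letter
    parts = [float(n) for n in re.findall(r"\d+(?:\.\d+)?", verbatim)]
    hemisphere = verbatim.strip()[-1].upper()
    # Missing minutes or seconds are treated as zero
    degrees, minutes, seconds = (parts + [0.0, 0.0])[:3]
    decimal = degrees + minutes / 60 + seconds / 3600
    # South and west are negative by convention
    return -decimal if hemisphere in ("S", "W") else decimal


print(round(dms_to_decimal("44°10′0″N"), 4))  # 44.1667
```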

Similarly, the EMuDate class parses common, unambiguous date formats into the formats accepted by EMu:

In [None]:
EMuDate("Jan 1970").emu_str()

Some common date formats (like nn/nn/nnnn) are ambiguous but can be parsed by passing the appropriate formatting string to EMuDate. For example, to parse a mm/dd/yyyy date:

In [None]:
EMuDate("01/02/1970", fmt="%m/%d/%Y")
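
To see why nn/nn/nnnn dates need an explicit format, the sketch below parses the same string both ways using only the standard library. It assumes the `fmt` argument uses the same directives as `datetime.strptime`, which the example above suggests but which is worth confirming against the xmu documentation.

```python
from datetime import datetime

raw = "01/02/1970"

# The same string yields different dates depending on the assumed field order
month_first = datetime.strptime(raw, "%m/%d/%Y").date()
day_first = datetime.strptime(raw, "%d/%m/%Y").date()

print(month_first.isoformat())  # 1970-01-02
print(day_first.isoformat())    # 1970-02-01
```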