# Fine-Tuning an LLM to Generate TextFSM Templates

This is a basic proof of concept for fine-tuning a large language model (LLM) to generate TextFSM templates based on raw text and the expected output.

The [ntc-templates](https://github.com/networktocode/ntc-templates) repository provides a collection of TextFSM templates, along with unit tests that include raw data and expected outputs. These resources serve as the foundation for this fine-tuning process.

Low-Rank Adaptation (LoRA) is used during fine-tuning to reduce memory consumption.
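
To make the memory argument concrete, here is a minimal sketch of the parameter arithmetic behind LoRA. Instead of updating a full `d_out x d_in` weight matrix, LoRA trains two small matrices of shape `d_out x rank` and `rank x d_in` and adds their product to the frozen weight. The 2048 x 2048 layer size is an illustrative assumption, not a value read from the Gemma configuration, and the function names are hypothetical.

```python
def full_params(d_in: int, d_out: int) -> int:
    """Trainable parameters when the whole weight matrix is updated."""
    return d_in * d_out


def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters for the two low-rank factors (d_out x rank and rank x d_in)."""
    return rank * (d_in + d_out)


d_in = d_out = 2048  # ASSUMPTION: illustrative layer size only
rank = 4             # same rank as used for fine-tuning in this notebook

full = full_params(d_in, d_out)
lora = lora_params(d_in, d_out, rank)
print(f"full: {full:,}  lora: {lora:,}  share: {lora / full:.4%}")
```

Only the low-rank factors need gradients and optimizer state, which is where most of the saving comes from.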

# Preparing the data

Download and process the data.

In [50]:
!mkdir -p ~/.kaggle/ && mv kaggle.json ~/.kaggle/ && chmod 600 ~/.kaggle/kaggle.json

mv: cannot stat 'kaggle.json': No such file or directory


In [51]:
!wget https://github.com/networktocode/ntc-templates/archive/refs/heads/master.zip -O master.zip

--2024-09-09 21:14:29--  https://github.com/networktocode/ntc-templates/archive/refs/heads/master.zip
Resolving github.com (github.com)... 140.82.112.4
Connecting to github.com (github.com)|140.82.112.4|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://codeload.github.com/networktocode/ntc-templates/zip/refs/heads/master [following]
--2024-09-09 21:14:29--  https://codeload.github.com/networktocode/ntc-templates/zip/refs/heads/master
Resolving codeload.github.com (codeload.github.com)... 140.82.112.10
Connecting to codeload.github.com (codeload.github.com)|140.82.112.10|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [application/zip]
Saving to: ‘master.zip’

master.zip              [  <=>               ]   2.99M  10.6MB/s    in 0.3s    

2024-09-09 21:14:30 (10.6 MB/s) - ‘master.zip’ saved [3135747]



In [52]:
!unzip -oq master.zip

In [53]:
from pathlib import Path
test_data = Path("ntc-templates-master/tests")
template_data = Path("ntc-templates-master/ntc_templates/templates")

In [54]:
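def get_data():
  """Yield one record per test case: raw text, expected YAML, and the matching template."""
  for x in test_data.rglob("*.raw"):
    try:
      raw = x.read_text()
      # The expected output sits next to the raw file as <stem>.yml or <stem>.yaml.
      fsm_data = next(x.parent.glob(f"{x.stem}.y*l")).read_text()
      # The template is named <platform>_<command>.textfsm after the two parent directories.
      fsm_template = (template_data / f"{x.parent.parent.name}_{x.parent.name}.textfsm").read_text()
      vendor = " ".join(x.parent.parent.name.split("_"))
      cmd = " ".join(x.parent.name.split("_"))
      yield dict(raw=raw, fsm_data=fsm_data, fsm_template=fsm_template, vendor=vendor, cmd=cmd)
    except (StopIteration, OSError, UnicodeError):
      continue  # Skip test cases with a missing or unreadable data or template file.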
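
The lookup in `get_data` relies on a naming convention: the two directories above a `.raw` file give the platform and the command, and joining them with an underscore gives the template file name. A small sketch of that mapping in isolation follows. The helper name `template_name_for` and the example path are illustrative and hypothetical, not taken from the repository listing.

```python
from pathlib import Path


def template_name_for(raw_file: Path) -> str:
    """Return the template file name implied by a test file's location."""
    command_dir = raw_file.parent        # e.g. .../cisco_ios/show_version
    platform_dir = command_dir.parent    # e.g. .../cisco_ios
    return f"{platform_dir.name}_{command_dir.name}.textfsm"


# Hypothetical path, used only to illustrate the convention.
example = Path("ntc-templates-master/tests/cisco_ios/show_version/sample.raw")
print(template_name_for(example))  # cisco_ios_show_version.textfsm
```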

In [55]:
!pip install pandas



In [56]:
import pandas as pd

In [57]:
df = pd.DataFrame(get_data())

In [58]:
df

Unnamed: 0,raw,fsm_data,fsm_template,vendor,cmd
0,IPv4 Dest-Routes for <untrust-vr> (0 entries)\...,"---\nparsed_sample:\n - best: ""*""\n id: ""4...",Value Filldown VR (\S+)\nValue BEST (\S+)\nVal...,juniper screenos,get route
1,VPN IF NAME IP MAC ...,"---\nparsed_sample:\n - vpn: ""0""\n name: ""...",Value VPN (\d+)\nValue NAME (\S+)\nValue IP_AD...,cisco viptela,show arp
2,\n IF\nVPN NAME IP MAC ...,"---\nparsed_sample:\n - vpn: ""0""\n name: ""...",Value VPN (\d+)\nValue NAME (\S+)\nValue IP_AD...,cisco viptela,show arp
3,\n IF ...,"---\nparsed_sample:\n- vpn: ""0""\n interface: ...",Value VPN (\d+)\nValue INTERFACE (\S+)\nValue ...,cisco viptela,show interface
4,"Capability codes: (R) Router, (B) Bridge, (O) ...","---\nparsed_sample:\n - capabilities: ""BR""\n ...",Value LOCAL_INTERFACE (\S+)\nValue NEIGHBOR_NA...,ipinfusion ocnos,show lldp table
...,...,...,...,...,...
1468,Domain name example.com\n,"---\nparsed_sample:\n - domainname: ""example....",Value DOMAINNAME (\S+)\n\nStart\n ^Domain nam...,checkpoint gaia,show domainname
1469,Virtual systems list\nVS ID VS NAME\n0 ...,"---\nparsed_sample:\n - instance_id: ""0""\n ...",Value INSTANCE_ID (\d+)\nValue INSTANCE_NAME (...,checkpoint gaia,show virtual-system all
1470,Interface Mgmt\n state on\n mac-addr 00:...,"---\nparsed_sample:\n - autoneg: ""on""\n co...",Value INTERFACE (\S+)\nValue STATE (\w+)\nValu...,checkpoint gaia,show interfaces all
1471,Interface Mgmt\n state on\n mac-addr 00:...,"---\nparsed_sample:\n - autoneg: ""on""\n co...",Value INTERFACE (\S+)\nValue STATE (\w+)\nValu...,checkpoint gaia,show interfaces all


## Create the prompt

In [59]:
prompt_template = """You are a powerful text-to-TextFSM model. Your job is generate TextFSM templates to extract data from semi structured text. You are given a example text and the expected structured data.

You must output the TextFSM template that extracts the expected structured data.

# Text:
```
{raw}
```

# Expected Data:
```
{fsm_data}
```

# Response:
```
{fsm_template}
```
"""

In [60]:
df["prompt"] = df.apply(lambda row: prompt_template.format(raw=row["raw"], fsm_data=row["fsm_data"], fsm_template=row["fsm_template"]), axis=1)
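
The training prompt ends with the answer, so it cannot be reused as-is at inference time. One possible way to keep training and inference prompts in sync is to derive the inference prompt from the same template by cutting it at the response section. This is a sketch under assumptions: `RESPONSE_MARKER` and `build_inference_prompt` are hypothetical names, and the toy template only mimics the structure of the real one.

```python
RESPONSE_MARKER = "# Response:"  # section header used in the prompt template


def build_inference_prompt(template: str, raw: str, fsm_data: str) -> str:
    """Fill in the prompt but stop before the response section."""
    head, marker, _ = template.partition(RESPONSE_MARKER)
    if not marker:
        raise ValueError("template has no response section")
    return head.format(raw=raw, fsm_data=fsm_data)


# Tiny stand-in for the full template, just to show the behaviour.
toy_template = "# Text:\n{raw}\n\n# Expected Data:\n{fsm_data}\n\n# Response:\n{fsm_template}\n"
prompt = build_inference_prompt(toy_template, raw="uptime 3 days", fsm_data="- uptime: 3 days")
print(prompt)
```

Cutting before the header mirrors the demo prompt used later in this notebook, which stops after the expected data. Keeping the header in the prompt would be an equally reasonable choice.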

In [61]:
df

Unnamed: 0,raw,fsm_data,fsm_template,vendor,cmd,prompt
0,IPv4 Dest-Routes for <untrust-vr> (0 entries)\...,"---\nparsed_sample:\n - best: ""*""\n id: ""4...",Value Filldown VR (\S+)\nValue BEST (\S+)\nVal...,juniper screenos,get route,You are a powerful text-to-TextFSM model. Your...
1,VPN IF NAME IP MAC ...,"---\nparsed_sample:\n - vpn: ""0""\n name: ""...",Value VPN (\d+)\nValue NAME (\S+)\nValue IP_AD...,cisco viptela,show arp,You are a powerful text-to-TextFSM model. Your...
2,\n IF\nVPN NAME IP MAC ...,"---\nparsed_sample:\n - vpn: ""0""\n name: ""...",Value VPN (\d+)\nValue NAME (\S+)\nValue IP_AD...,cisco viptela,show arp,You are a powerful text-to-TextFSM model. Your...
3,\n IF ...,"---\nparsed_sample:\n- vpn: ""0""\n interface: ...",Value VPN (\d+)\nValue INTERFACE (\S+)\nValue ...,cisco viptela,show interface,You are a powerful text-to-TextFSM model. Your...
4,"Capability codes: (R) Router, (B) Bridge, (O) ...","---\nparsed_sample:\n - capabilities: ""BR""\n ...",Value LOCAL_INTERFACE (\S+)\nValue NEIGHBOR_NA...,ipinfusion ocnos,show lldp table,You are a powerful text-to-TextFSM model. Your...
...,...,...,...,...,...,...
1468,Domain name example.com\n,"---\nparsed_sample:\n - domainname: ""example....",Value DOMAINNAME (\S+)\n\nStart\n ^Domain nam...,checkpoint gaia,show domainname,You are a powerful text-to-TextFSM model. Your...
1469,Virtual systems list\nVS ID VS NAME\n0 ...,"---\nparsed_sample:\n - instance_id: ""0""\n ...",Value INSTANCE_ID (\d+)\nValue INSTANCE_NAME (...,checkpoint gaia,show virtual-system all,You are a powerful text-to-TextFSM model. Your...
1470,Interface Mgmt\n state on\n mac-addr 00:...,"---\nparsed_sample:\n - autoneg: ""on""\n co...",Value INTERFACE (\S+)\nValue STATE (\w+)\nValu...,checkpoint gaia,show interfaces all,You are a powerful text-to-TextFSM model. Your...
1471,Interface Mgmt\n state on\n mac-addr 00:...,"---\nparsed_sample:\n - autoneg: ""on""\n co...",Value INTERFACE (\S+)\nValue STATE (\w+)\nValu...,checkpoint gaia,show interfaces all,You are a powerful text-to-TextFSM model. Your...


## Shuffle the dataset and take the first 1200 examples for training

In [62]:
df = df.sample(frac=1).reset_index(drop=True)

In [63]:
df

Unnamed: 0,raw,fsm_data,fsm_template,vendor,cmd,prompt
0,\nEnqueued Dequeued ID ...,"---\nparsed_sample:\n - completed: ""10:20:11""...",Value ENQUEUED (\S+\s+\S+?)\nValue DEQUEUED (\...,paloalto panos,show jobs all,You are a powerful text-to-TextFSM model. Your...
1,Vlan650 - Group 650 (HSRP-V2) (IPv4)\n Local ...,"---\nparsed_sample:\n - active_expire: """"\n ...",# Object names are based on pyATS/Genie parser...,cisco nxos,show hsrp all,You are a powerful text-to-TextFSM model. Your...
2,Huawei Versatile Routing Platform Software\nVR...,"---\nparsed_sample:\n - vrp_version: ""5.170""\...",Value VRP_VERSION (\S+)\nValue PRODUCT_VERSION...,huawei vrp,display version,You are a powerful text-to-TextFSM model. Your...
3,SEQ HOST ...,"---\nparsed_sample:\n - seq: [""0"", ""1"", ""2"", ...",Value List SEQ (\d+)\nValue List HOST (\S+)\nV...,mikrotik routeros,ping,You are a powerful text-to-TextFSM model. Your...
4,"Fri Aug 9 09:08:06.986 CDT\nNAME: ""module 0/R...","---\nparsed_sample:\n - descr: ""ASR9K Route S...",Value Required NAME (.*?)\nValue DESCR (.*?)\n...,cisco xr,admin show inventory,You are a powerful text-to-TextFSM model. Your...
...,...,...,...,...,...,...
1468,\n Status and Counters - General System Inform...,"---\nparsed_sample:\n - allow_mods: ""Yes""\n ...",Value NAME (\S+)\nValue CONTACT (.+)\nValue LO...,hp procurve,show system,You are a powerful text-to-TextFSM model. Your...
1469,0 A S dst-address=0.0.0.0/0 gateway=192.168....,"---\nparsed_sample:\n - check_gateway: """"\n ...",Value INDEX (\d+)\nValue FLAGS ([XADCSrbomBUP ...,mikrotik routeros,ip route print terse without-paging,You are a powerful text-to-TextFSM model. Your...
1470,\nClean Air Solution.............................,"---\nparsed_sample:\n - air_quality_alarm: ""E...",Value CLEANAIR (.+?)\nValue AIR_QUALITY_REPORT...,cisco wlc ssh,show 802.11a cleanair config,You are a powerful text-to-TextFSM model. Your...
1471,"Syslog logging: enabled (0 messages dropped, 3...","---\nparsed_sample:\n - number: """"\n month...",Value NUMBER (\d+)\nValue MONTH (\S+)\nValue D...,cisco ios,show logging,You are a powerful text-to-TextFSM model. Your...


In [64]:
data = df.prompt.tolist()[:1200]
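
Several test cases share one template (for example, the two `checkpoint gaia` / `show interfaces all` rows in the tables above), so a plain row-wise split can put the same template into both the training set and the remaining rows. If the leftover rows are later used for evaluation, a group-aware split avoids that leak. The following is a sketch of an alternative, not what this notebook does. `group_split` is a hypothetical helper, and plain dictionaries stand in for DataFrame rows.

```python
import random


def group_split(rows, train_fraction=0.8, seed=0):
    """Split rows so that every (vendor, cmd) pair lands on one side only."""
    groups = {}
    for row in rows:
        groups.setdefault((row["vendor"], row["cmd"]), []).append(row)
    keys = sorted(groups)                 # sort first so the shuffle is reproducible
    random.Random(seed).shuffle(keys)
    cut = int(len(keys) * train_fraction)
    train = [r for k in keys[:cut] for r in groups[k]]
    holdout = [r for k in keys[cut:] for r in groups[k]]
    return train, holdout


rows = [
    {"vendor": "checkpoint gaia", "cmd": "show interfaces all", "prompt": "a"},
    {"vendor": "checkpoint gaia", "cmd": "show interfaces all", "prompt": "b"},
    {"vendor": "cisco viptela", "cmd": "show arp", "prompt": "c"},
    {"vendor": "cisco viptela", "cmd": "show arp", "prompt": "d"},
    {"vendor": "cisco ios", "cmd": "show logging", "prompt": "e"},
]
train, holdout = group_split(rows, train_fraction=0.67)
print(len(train), len(holdout))
```

With the DataFrame from this notebook, `df.to_dict("records")` would provide rows in the expected shape.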

In [65]:
from IPython.display import display, Markdown

display(Markdown(data[0]))

You are a powerful text-to-TextFSM model. Your job is generate TextFSM templates to extract data from semi structured text. You are given a example text and the expected structured data.

You must output the TextFSM template that extracts the expected structured data.

# Text:
```

Enqueued              Dequeued           ID  PositionInQ                              Type                         Status Result Completed
------------------------------------------------------------------------------------------------------------------------------------------
2017/02/28 10:19:48   10:19:48            7                                    FqdnRefresh                            FIN     OK 10:20:11
2017/02/28 10:13:49   10:13:49            6                                    FqdnRefresh                            FIN     OK 10:14:21
2017/02/28 10:13:22   10:13:22            5                                         Commit                            FIN     OK 10:13:49
2017/02/27 12:06:50   12:06:50            4                                         Commit                            FIN     OK 12:07:18
2017/02/27 12:02:54   12:02:54            3                                         Commit                            FIN     OK 12:03:20
2017/02/27 11:55:15   11:55:15            2                                         Commit                            FIN     OK 11:55:42
2017/02/23 08:31:14   08:31:14            1                                        AutoCom                            FIN     OK 08:32:06



```

# Expected Data:
```
---
parsed_sample:
  - completed: "10:20:11"
    dequeued: "10:19:48"
    enqueued: "2017/02/28 10:19:48"
    id: "7"
    result: "OK"
    status: "FIN"
    type: "FqdnRefresh"
  - completed: "10:14:21"
    dequeued: "10:13:49"
    enqueued: "2017/02/28 10:13:49"
    id: "6"
    result: "OK"
    status: "FIN"
    type: "FqdnRefresh"
  - completed: "10:13:49"
    dequeued: "10:13:22"
    enqueued: "2017/02/28 10:13:22"
    id: "5"
    result: "OK"
    status: "FIN"
    type: "Commit"
  - completed: "12:07:18"
    dequeued: "12:06:50"
    enqueued: "2017/02/27 12:06:50"
    id: "4"
    result: "OK"
    status: "FIN"
    type: "Commit"
  - completed: "12:03:20"
    dequeued: "12:02:54"
    enqueued: "2017/02/27 12:02:54"
    id: "3"
    result: "OK"
    status: "FIN"
    type: "Commit"
  - completed: "11:55:42"
    dequeued: "11:55:15"
    enqueued: "2017/02/27 11:55:15"
    id: "2"
    result: "OK"
    status: "FIN"
    type: "Commit"
  - completed: "08:32:06"
    dequeued: "08:31:14"
    enqueued: "2017/02/23 08:31:14"
    id: "1"
    result: "OK"
    status: "FIN"
    type: "AutoCom"

```

# Response:
```
Value ENQUEUED (\S+\s+\S+?)
Value DEQUEUED (\S+)
Value ID (\d+)
Value TYPE (\w+)
Value STATUS (\w+)
Value RESULT (\w+)
Value COMPLETED (\S+)

Start
#  ^${ENQUEUED}\s+\S+\s+${ID}\s+${TYPE}\s+${STATUS}\s+${RESULT}\s+${COMPLETED} -> Record
  ^${ENQUEUED}\s+${ID}\s+${TYPE}\s+${STATUS}\s+${RESULT}\s+${COMPLETED} -> Record
  ^${ENQUEUED}\s+${DEQUEUED}\s+${ID}\s+${TYPE}\s+${STATUS}\s+${RESULT}\s+${COMPLETED} -> Record

```


# Build LLM

In [66]:
!pip install -q -U keras-nlp
!pip install -q -U "keras>=3"

In [67]:
import os

os.environ["KERAS_BACKEND"] = "jax"  # Or "torch" or "tensorflow".
# Avoid memory fragmentation on JAX backend.
os.environ["XLA_PYTHON_CLIENT_MEM_FRACTION"]="1.00"

import keras
import keras_nlp

In [68]:
class Settings:
    base_model = "gemma_instruct_2b_en"  # Keras preset name of the base LLM
    rank = 4  # LoRA rank
    sequence_length = 2048  # maximum tokens per example; longer prompts are truncated
    batch_size = 1  # increase if GPU memory allows
    epochs = 4  # more epochs may help
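
Because the template sits at the very end of each prompt, an example that exceeds `sequence_length` loses part or all of its target when it is cut off. A quick check of how many prompts are affected can be sketched as follows. The default token estimate of roughly four characters per token is a crude assumption. A real check would pass the model's tokenizer as `count_tokens`. `over_budget` is a hypothetical helper.

```python
def over_budget(prompts, max_tokens=2048, count_tokens=None):
    """Return the indices of prompts whose estimated token count exceeds max_tokens."""
    if count_tokens is None:
        # ASSUMPTION: about 4 characters per token; replace with a real tokenizer count.
        count_tokens = lambda text: len(text) // 4
    return [i for i, p in enumerate(prompts) if count_tokens(p) > max_tokens]


samples = ["short prompt", "x" * 10_000]
print(over_budget(samples))  # [1]
```

Examples flagged this way could be dropped or shortened before training rather than silently truncated.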

Provide the Kaggle API token and accept the Gemma license on Kaggle to download the model.

In [69]:
%%time
# The 2B model fits on a free Colab or Kaggle notebook; the 7B model would be worth trying with more memory.

gemma_lm = keras_nlp.models.GemmaCausalLM.from_preset(Settings.base_model)

CPU times: user 4.9 s, sys: 6.33 s, total: 11.2 s
Wall time: 12.9 s


In [70]:
gemma_lm.summary()

## Test Base LLM

In [71]:
%%time
display(Markdown(gemma_lm.generate("What is TextFSM in the context of Network Automation?", max_length=1024)))

What is TextFSM in the context of Network Automation?

TextFSM is a software tool that can be used to automate the configuration and management of network devices and services. It is a powerful tool that can be used to streamline the network configuration process and to ensure that all devices and services are configured correctly.

**Key features of TextFSM include:**

* **Device and service discovery:** TextFSM can automatically discover the network devices and services that are available.
* **Configuration management:** TextFSM can be used to manage the configuration of network devices and services.
* **Troubleshooting:** TextFSM can be used to troubleshoot network issues.
* **Reporting:** TextFSM can generate reports on the status of the network.

TextFSM is a popular choice for network automation due to its ease of use and its powerful features. It is a valuable tool for any network administrator who wants to streamline the network configuration process and to ensure that all devices and services are configured correctly.

CPU times: user 10.6 s, sys: 214 ms, total: 10.9 s
Wall time: 10.1 s


In [72]:
%%time
demo_prompt = """You are a powerful text-to-TextFSM model. Your job is generate TextFSM templates to extract data from semi structured text. You are given a example text and the expected structured data.

You must output the TextFSM template that extracts the expected structured data.

# Text:
```
18:42:41.321 PST Sun Feb 8 2009
12:18:42.123 CET Sun Feb 14 2021
08:15:00.0 PST Mon Okt 31 2020
```

# Expected Data:
```
[
  {
    "Year": "2009",
    "MonthDay": "8",
    "Month": "Feb",
    "Timezone": "PST",
    "Time": "18:42:41"
  },
  {
    "Year": "2021",
    "MonthDay": "14",
    "Month": "Feb",
    "Timezone": "CET",
    "Time": "12:18:42"
  },
  {
    "Year": "2020",
    "MonthDay": "31",
    "Month": "Okt",
    "Timezone": "PST",
    "Time": "08:15:00"
  }
]
```
"""
display(Markdown(gemma_lm.generate(demo_prompt, max_length=1024)))

You are a powerful text-to-TextFSM model. Your job is generate TextFSM templates to extract data from semi structured text. You are given a example text and the expected structured data.

You must output the TextFSM template that extracts the expected structured data.

# Text:
```
18:42:41.321 PST Sun Feb 8 2009
12:18:42.123 CET Sun Feb 14 2021
08:15:00.0 PST Mon Okt 31 2020
```

# Expected Data:
```
[
  {
    "Year": "2009",
    "MonthDay": "8",
    "Month": "Feb",
    "Timezone": "PST",
    "Time": "18:42:41"
  },
  {
    "Year": "2021",
    "MonthDay": "14",
    "Month": "Feb",
    "Timezone": "CET",
    "Time": "12:18:42"
  },
  {
    "Year": "2020",
    "MonthDay": "31",
    "Month": "Okt",
    "Timezone": "PST",
    "Time": "08:15:00"
  }
]
```
**TextFSM Template:**
```
(Timestamp) ([Year] ([Month] ([Day])) ([Timezone]) ([Time])
```

CPU times: user 338 ms, sys: 0 ns, total: 338 ms
Wall time: 335 ms


## Fine-tuning with Low-Rank Adaptation (LoRA)

In [73]:
gemma_lm.backbone.enable_lora(rank=Settings.rank)
gemma_lm.summary()

In [74]:
# Set sequence length to control memory usage
gemma_lm.preprocessor.sequence_length = Settings.sequence_length

# Use AdamW (Adam with Weight Decay)
optimizer = keras.optimizers.AdamW(
    learning_rate=5e-5,  # Other values worth trying: 1e-4 or 2e-4 (higher), 2e-5 (lower)
    weight_decay=0.01,
    beta_1=0.9,
    beta_2=0.999
)

optimizer.exclude_from_weight_decay(var_names=["bias", "scale"])

gemma_lm.compile(
    loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    optimizer=optimizer,
    weighted_metrics=[keras.metrics.SparseCategoricalAccuracy()]
)
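
For readers unfamiliar with the loss used above, the following sketch computes the same quantity for a single token by hand: the negative log of the softmax probability assigned to the correct token, starting from raw logits. The three-entry vocabulary and the function name are illustrative assumptions. The real model averages this value over all unmasked tokens.

```python
import math


def token_cross_entropy(logits, target_index):
    """Cross-entropy of one token: -log softmax(logits)[target_index]."""
    peak = max(logits)  # subtract the max for numerical stability
    log_sum_exp = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_sum_exp - logits[target_index]


logits = [2.0, 0.5, -1.0]  # toy 3-token vocabulary
print(token_cross_entropy(logits, target_index=0))  # small: the model favours token 0
print(token_cross_entropy(logits, target_index=2))  # large: token 2 is unlikely
```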

In [75]:
gemma_lm.fit(data, epochs=Settings.epochs, batch_size=Settings.batch_size)

Epoch 1/4
1200/1200 ━━━━━━━━━━━━━━━━━━━━ 348s 282ms/step - loss: 0.6812 - sparse_categorical_accuracy: 0.8045
Epoch 2/4
1200/1200 ━━━━━━━━━━━━━━━━━━━━ 338s 275ms/step - loss: 0.4932 - sparse_categorical_accuracy: 0.8419
Epoch 3/4
1200/1200 ━━━━━━━━━━━━━━━━━━━━ 331s 275ms/step - loss: 0.4671 - sparse_categorical_accuracy: 0.8481
Epoch 4/4
1200/1200 ━━━━━━━━━━━━━━━━━━━━ 331s 276ms/step - loss: 0.4483 - sparse_categorical_accuracy: 0.8529


<keras.src.callbacks.history.History at 0x7e01cc5973a0>
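
A mean per-token cross-entropy can be easier to interpret as a perplexity, which is simply its exponential. The sketch below converts the four epoch losses reported in the training log. These are training losses, so they say nothing about how well the model generalises to unseen commands.

```python
import math


def perplexity(mean_token_loss: float) -> float:
    """Perplexity is the exponential of the mean per-token cross-entropy."""
    return math.exp(mean_token_loss)


epoch_losses = [0.6812, 0.4932, 0.4671, 0.4483]  # copied from the training log
for epoch, loss in enumerate(epoch_losses, start=1):
    print(f"epoch {epoch}: loss={loss:.4f}  perplexity={perplexity(loss):.3f}")
```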

In [76]:
gemma_lm.save("textfsmLLM.keras")

## Test new LLM

In [77]:
print(demo_prompt)

You are a powerful text-to-TextFSM model. Your job is generate TextFSM templates to extract data from semi structured text. You are given a example text and the expected structured data.

You must output the TextFSM template that extracts the expected structured data.

# Text:
```
18:42:41.321 PST Sun Feb 8 2009
12:18:42.123 CET Sun Feb 14 2021
08:15:00.0 PST Mon Okt 31 2020
```

# Expected Data:
```
[
  {
    "Year": "2009",
    "MonthDay": "8",
    "Month": "Feb",
    "Timezone": "PST",
    "Time": "18:42:41"
  },
  {
    "Year": "2021",
    "MonthDay": "14",
    "Month": "Feb",
    "Timezone": "CET",
    "Time": "12:18:42"
  },
  {
    "Year": "2020",
    "MonthDay": "31",
    "Month": "Okt",
    "Timezone": "PST",
    "Time": "08:15:00"
  }
]
```



In [78]:
%%time
display(Markdown(gemma_lm.generate(demo_prompt, max_length=1024)))

You are a powerful text-to-TextFSM model. Your job is generate TextFSM templates to extract data from semi structured text. You are given a example text and the expected structured data.

You must output the TextFSM template that extracts the expected structured data.

# Text:
```
18:42:41.321 PST Sun Feb 8 2009
12:18:42.123 CET Sun Feb 14 2021
08:15:00.0 PST Mon Okt 31 2020
```

# Expected Data:
```
[
  {
    "Year": "2009",
    "MonthDay": "8",
    "Month": "Feb",
    "Timezone": "PST",
    "Time": "18:42:41"
  },
  {
    "Year": "2021",
    "MonthDay": "14",
    "Month": "Feb",
    "Timezone": "CET",
    "Time": "12:18:42"
  },
  {
    "Year": "2020",
    "MonthDay": "31",
    "Month": "Okt",
    "Timezone": "PST",
    "Time": "08:15:00"
  }
]
```
# Response:
```
Value YEAR (\d{4})
Value MONTHDAY (\d{2})
Value MONTH (\w{3})
Value TIMEZONE (\S{2})
Value TIME (\S{4})

Start
  ^${YEAR}\:${MONTHDAY}\s+${TIMEZONE}\s+${TIME}\s*$$ -> Record
  ^${YEAR}\:${MONTH}\s+${TIMEZONE}\s+${TIME}\s*$$ -> Record
  ^${YEAR}\:${MONTHDAY}\s+${TIMEZONE}\s*$$ -> Record
  ^${YEAR}\:${MONTHDAY}\s+${TIME}\s*$$ -> Record
  ^${YEAR}\:${MONTHDAY}\s+${TIMEZONE}\s*$$ -> Record
  ^${YEAR}\:${MONTHDAY}\s+${TIME}\s*$$ -> Record
  ^${YEAR}\:${MONTHDAY}\s+${TIME}\s*$$ -> Record
  ^${YEAR}\:${MONTHDAY}\s+${TIME}\s*$$ -> Record
  ^${YEAR}\:${MONTHDAY}\s+${TIME}\s*$$ -> Record
  ^${YEAR}\:${MONTHDAY}\s+${TIME}\s*$$ -> Record
  ^${YEAR}\:${MONTHDAY}\s+${TIME}\s*$$ -> Record
  ^${YEAR}\:${MONTHDAY}\s+${TIME}\s*$$ -> Record
  ^${YEAR}\:${MONTHDAY}\s+${TIME}\s*$$ -> Record
  ^${YEAR}\:${MONTHDAY}\s+${TIME}\s*$$ -> Record
  ^${YEAR}\:${MONTHDAY}\s+${TIME}\s*$$ -> Record
  ^${YEAR}\:${MONTHDAY}\s+${TIME}\s*$$ -> Record
  ^${YEAR}\:${MONTHDAY}\s+${TIME}\s*$$ -> Record
  ^${YEAR}\:${MONTHDAY}\s+${TIME}\s*$$ -> Record
  ^${YEAR}\:${MONTHDAY}\s+${TIME}\s*$$ -> Record
  ^${YEAR}\:${MONTHDAY}\s+${TIME}\s*$$ -> Record
  ^${YEAR}\:${MONTHDAY}\s+${TIME}\s*$$ -> Record
  ^${YEAR}\:${MONTHDAY}\s+${TIME}\s*$$ -> Record
  ^${YEAR}\:${MONTHDAY}\s+${TIME}\s*$$ -> Record
  ^${YEAR}\:${MONTHDAY}\s+${TIME}\s*$$ -> Record
  ^${YEAR}\:${MONTHDAY}\s+${TIME}\s*$$ -> Record
  ^${YEAR}\:${MONTHDAY}\s+${TIME}\s*$$ -> Record
  ^${YEAR}\:${MONTHDAY}\s+${TIME}\s*$$ -> Record
  ^${YEAR}\:${MONTHDAY}\s+${TIME}\s*$$ -> Record
  ^${YEAR}\:${MONTHDAY}\s+${TIME}\s*$$ -> Record
  ^${YEAR}\:${MONTHDAY}\s+${TIME}\s*$$ -> Record
  ^${YEAR}\:${MONTHDAY}\s+${TIME}\s*$$ -> Record
  ^${YEAR}\:${MONTHDAY}\s+${TIME}\s*$$ -> Record
  

CPU times: user 16.2 s, sys: 140 ms, total: 16.3 s
Wall time: 15.6 s


# ToDo

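This is only a first attempt. The fine-tuned model's output above is not yet a usable template: it repeats the same rule many times and does not match the input lines. The approach still needs extensive validation and optimization.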
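
As a starting point for validation, the generated text first has to be separated from the echoed prompt, and obviously degenerate outputs can be flagged cheaply before any real parsing is attempted. The sketch below shows one possible way to do both. `extract_template` and `repeated_line_ratio` are hypothetical helpers, and the toy strings stand in for real model output.

```python
from collections import Counter

FENCE = "`" * 3  # built at runtime so this example can itself sit inside a fenced block


def extract_template(generated: str, prompt: str) -> str:
    """Drop the echoed prompt and return the body of the first fenced block after it."""
    completion = generated[len(prompt):] if generated.startswith(prompt) else generated
    parts = completion.split(FENCE)
    # parts[0] is the text before the first fence, parts[1] the fenced body (if any).
    body = parts[1] if len(parts) > 1 else completion
    return body.strip("\n")


def repeated_line_ratio(template: str) -> float:
    """Fraction of non-empty lines that duplicate an earlier line."""
    lines = [ln.strip() for ln in template.splitlines() if ln.strip()]
    if not lines:
        return 0.0
    counts = Counter(lines)
    return sum(c - 1 for c in counts.values()) / len(lines)


prompt = "# Text:\nfoo\n"
generated = (
    prompt + "# Response:\n" + FENCE
    + "\nValue X (\\d+)\n\nStart\n  ^${X} -> Record\n  ^${X} -> Record\n" + FENCE
)
template = extract_template(generated, prompt)
print(template)
print(repeated_line_ratio(template))  # 0.25
```

A natural next step would be to run each extracted template against its raw text with the TextFSM library and compare the result with the expected data.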