# Parsing Senate Bill Status

> Both the current legislative status field and the legislative history section contain free-text strings that depict the actions taken on a bill, their dates, and the potential actors. These strings are not particularly well formatted, so we need some custom string parsing to extract them.


In [None]:
# | default_exp senate.bill_status
# | export
import re
import datetime

from typing import List
from nbdev.showdoc import show_doc

from legisph.senate.models import SenateBill, Senator, SenateCommittee

## Parsing Strategy 

Our strategy is to cycle through a predefined list of classes, each with a `parse()` class method that returns a tuple: the parsed status (or `None` if the text does not match) and a flag indicating whether to keep cycling through the remaining classes (`True`) or to stop (`False`).
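To make the contract concrete, here is a minimal sketch of the interface each status class is expected to satisfy. `Status`, `StatusParser`, and `AlwaysUnmatched` are hypothetical names used only for illustration. `Status` stands in for `SenateBill.SenateBillStatus` so that the snippet runs on its own.

```python
import datetime
from dataclasses import dataclass
from typing import Optional, Protocol, Tuple


@dataclass
class Status:
    """Hypothetical stand-in for SenateBill.SenateBillStatus."""

    date: datetime.date
    item: str


class StatusParser(Protocol):
    """Hypothetical interface: every status class exposes `parse()`."""

    @classmethod
    def parse(cls, h: Status) -> Tuple[Optional[Status], bool]:
        """Return (parsed status or None, whether to keep cycling)."""
        ...


class AlwaysUnmatched:
    """Trivial parser that never matches, so it asks the caller to continue."""

    @classmethod
    def parse(cls, h: Status) -> Tuple[Optional[Status], bool]:
        return (None, True)


result = AlwaysUnmatched.parse(
    Status(date=datetime.date(2022, 10, 7), item="Pending in the Committee")
)
```

Here `result` is `(None, True)`: nothing was parsed, and the caller should move on to the next class.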

### Pending In Committee

For example, we first define the most common status for "Pending in committee" as follows:

In [None]:
# | export
class PendingInCommittee(SenateBill.SenateBillStatus):
    name: str = "Pending in Committee"

    @classmethod
    def parse(cls, h):
        if h.item == "Pending in the Committee":
            return (cls(**h.dict()), False)
        return (None, True)

This parses a status and turns it into the relevant subclass. The second element of the tuple is `False` because an exact match leaves nothing further to parse, so cycling can stop.

In [None]:
PendingInCommittee.parse(
    SenateBill.SenateBillStatus(
        date=datetime.date(2022, 10, 7), item="Pending in the Committee"
    )
)

(PendingInCommittee(date=datetime.date(2022, 10, 7), item='Pending in the Committee', name='Pending in Committee'),
 False)

The function below takes a predefined list of classes and cycles through them, collecting the parsed actions along the way.

In [None]:
# | export
def parse_senate_bill_status(
    status: SenateBill.SenateBillStatus,  # Senate Bill Status to parse into a subclass
    classes: list,  # List of classes through which to cycle
):
    actions = []
    for c in classes:
        action, cycle = c.parse(status)
        if action is not None:
            actions.append(action)
        if not cycle:
            break
    return actions


show_doc(parse_senate_bill_status)

---

### parse_senate_bill_status

>      parse_senate_bill_status
>                                (status:legisph.senate.models.SenateBill.SenateBillStatus,
>                                classes:list)

|    | **Type** | **Details** |
| -- | -------- | ----------- |
| status | SenateBillStatus | Senate Bill Status to parse into a subclass |
| classes | list | List of classes through which to cycle |

In [None]:
parse_senate_bill_status(
    SenateBill.SenateBillStatus(
        date=datetime.date(2022, 10, 7), item="Pending in the Committee"
    ),
    classes=[PendingInCommittee],
)

[PendingInCommittee(date=datetime.date(2022, 10, 7), item='Pending in the Committee', name='Pending in Committee')]
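The single-class call above does not show the short circuit in action. The sketch below repeats the same loop with two classes so that the effect of the second tuple element is visible. It uses dataclasses as hypothetical stand-ins for the `legisph` models (`Status`, `Pending`, `Recorder`, and `cycle` are illustrative names, and `"Some other status"` is a made-up string), so it runs without the package installed.

```python
import datetime
from dataclasses import asdict, dataclass


@dataclass
class Status:
    """Hypothetical stand-in for SenateBill.SenateBillStatus."""

    date: datetime.date
    item: str


@dataclass
class Pending(Status):
    name: str = "Pending in Committee"

    @classmethod
    def parse(cls, h):
        if h.item == "Pending in the Committee":
            return (cls(**asdict(h)), False)  # exact match: stop cycling
        return (None, True)  # no match: try the next class


class Recorder:
    """Hypothetical parser that never matches and logs each item it sees."""

    seen = []

    @classmethod
    def parse(cls, h):
        cls.seen.append(h.item)
        return (None, True)


def cycle(status, classes):
    """Same loop as parse_senate_bill_status, repeated so the sketch runs alone."""
    actions = []
    for c in classes:
        action, keep_going = c.parse(status)
        if action is not None:
            actions.append(action)
        if not keep_going:
            break
    return actions


pending = Status(date=datetime.date(2022, 10, 7), item="Pending in the Committee")
other = Status(date=datetime.date(2022, 10, 7), item="Some other status")

matched = cycle(pending, [Pending, Recorder])  # Pending matches, Recorder is skipped
unmatched = cycle(other, [Pending, Recorder])  # nothing matches, Recorder is consulted
```

After both calls, `Recorder.seen` contains only `"Some other status"`: the exact match on the first status stopped the loop before `Recorder` was reached.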

## Parsing All Statuses

We then proceed to define all the remaining statuses below:


### Joint Proceedings

In [None]:
# | export
class JointProceedings(SenateBill.SenateBillStatus):
    name: str = "Conducted Joint Proceedings"

    @classmethod
    def parse(cls, h):
        if h.item == "Conducted JOINT COMMITTEE MEETINGS/HEARINGS;":
            return (cls(**h.dict()), False)
        return (None, True)

In [None]:
JointProceedings.parse(
    SenateBill.SenateBillStatus(
        date=datetime.date(2022, 10, 7),
        item="Conducted JOINT COMMITTEE MEETINGS/HEARINGS;",
    )
)

(JointProceedings(date=datetime.date(2022, 10, 7), item='Conducted JOINT COMMITTEE MEETINGS/HEARINGS;', name='Conducted Joint Proceedings'),
 False)

### Introduced

In [None]:
# | export
class Introduced(SenateBill.SenateBillStatus):
    name: str = "Introduced by a Senator"
    senator: Senator

    @classmethod
    def parse(cls, h):
        if h.item.startswith("Introduced by Senator "):
            return (
                cls(
                    **h.dict(),
                    senator=Senator(
                        name=(
                            h.item.replace("Introduced by Senator ", "").replace(
                                ";", ""
                            )
                        )
                    )
                ),
                True,
            )
        return (None, True)

In [None]:
Introduced.parse(
    SenateBill.SenateBillStatus(
        date=datetime.date(2022, 10, 7),
        item="Introduced by Senator JINGGOY P. EJERCITO-ESTRADA;",
    )
)

(Introduced(date=datetime.date(2022, 10, 7), item='Introduced by Senator JINGGOY P. EJERCITO-ESTRADA;', name='Introduced by a Senator', senator=Senator(name='JINGGOY P. EJERCITO-ESTRADA')),
 True)
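Note that, unlike the exact-match classes, `Introduced.parse` returns `True` as the second element even when it matches, so the loop keeps cycling through the remaining classes. The name extraction itself can also be written with a regular expression. The sketch below is one possible alternative, not the code used in this module. `INTRODUCED` and `senator_name` are hypothetical names, and the pattern strips only a trailing `;`, whereas the `replace` calls above remove every `;` in the string.

```python
import re
from typing import Optional

# Hypothetical pattern: capture everything between the prefix and an optional trailing ";".
INTRODUCED = re.compile(r"^Introduced by Senator (?P<name>.+?);?$")


def senator_name(item: str) -> Optional[str]:
    """Return the senator's name, or None when the item is not an introduction."""
    match = INTRODUCED.match(item)
    return match.group("name") if match else None


name = senator_name("Introduced by Senator JINGGOY P. EJERCITO-ESTRADA;")
not_introduced = senator_name("Pending in the Committee")
```

With the example string from above, `name` is `'JINGGOY P. EJERCITO-ESTRADA'`, and `not_introduced` is `None`.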

### Committee Report Calendared For Ordinary Business

In [None]:
# | export
class CommitteeReportCalendaredForOrdinaryBusiness(SenateBill.SenateBillStatus):
    name: str = "Committee Report Calendared for Ordinary Business"

    @classmethod
    def parse(cls, h):
        if h.item == "Committee Report Calendared for Ordinary Business;":
            return (cls(**h.dict()), False)
        return (None, True)

In [None]:
CommitteeReportCalendaredForOrdinaryBusiness.parse(
    SenateBill.SenateBillStatus(
        date=datetime.date(2022, 10, 7),
        item="Committee Report Calendared for Ordinary Business;",
    )
)

(CommitteeReportCalendaredForOrdinaryBusiness(date=datetime.date(2022, 10, 7), item='Committee Report Calendared for Ordinary Business;', name='Committee Report Calendared for Ordinary Business'),
 False)

### Consolidated or Substituted in Committee Report

In [None]:
# | export
class ConsolidatedOrSubstitutedInCommitteeReport(SenateBill.SenateBillStatus):
    name: str = "Consolidated or Substituted in Committee Report"

    @classmethod
    def parse(cls, h):
        if h.item == "Consolidated/Substituted in the Committee Report":
            return (cls(**h.dict()), False)
        return (None, True)

In [None]:
ConsolidatedOrSubstitutedInCommitteeReport.parse(
    SenateBill.SenateBillStatus(
        date=datetime.date(2022, 10, 7),
        item="Consolidated/Substituted in the Committee Report",
    )
)

(ConsolidatedOrSubstitutedInCommitteeReport(date=datetime.date(2022, 10, 7), item='Consolidated/Substituted in the Committee Report', name='Consolidated or Substituted in Committee Report'),
 False)

### Technical Working Group

In [None]:
# | export
class TechnicalWorkingGroup(SenateBill.SenateBillStatus):
    name: str = "Conducted a Technical Working Group"

    @classmethod
    def parse(cls, h):
        if h.item == "Conducted TECHNICAL WORKING GROUP;":
            return (cls(**h.dict()), False)
        return (None, True)

In [None]:
TechnicalWorkingGroup.parse(
    SenateBill.SenateBillStatus(
        date=datetime.date(2022, 10, 7), item="Conducted TECHNICAL WORKING GROUP;"
    )
)

(TechnicalWorkingGroup(date=datetime.date(2022, 10, 7), item='Conducted TECHNICAL WORKING GROUP;', name='Conducted a Technical Working Group'),
 False)

### First Reading

In [None]:
# | export
class FirstReading(SenateBill.SenateBillStatus):
    name: str = "Read on First Reading and Referred to Committee"
    committees: List[SenateCommittee]

    @classmethod
    def parse(cls, h):
        slug1 = "Read on First Reading and Referred to the Committee on "
        slug2 = "Read on First Reading and Referred to the Committee(s) on "
        if h.item.startswith(slug1):
            committee = SenateCommittee(name=h.item.replace(slug1, "").replace(";", ""))
            return (cls(**h.dict(), committees=[committee]), False)
        if h.item.startswith(slug2):
            committees = h.item.replace(slug2, "")
            committees = re.split("[ ]*and[ ]*|[ ]*;[ ]*", committees)
            committees = [
                SenateCommittee(name=name) for name in committees if name != ""
            ]
            return (cls(**h.dict(), committees=committees), False)
        return (None, True)

In [None]:
FirstReading.parse(
    SenateBill.SenateBillStatus(
        date=datetime.date(2022, 10, 7),
        item="Read on First Reading and Referred to the Committee on JUSTICE AND HUMAN RIGHTS;",
    )
)

(FirstReading(date=datetime.date(2022, 10, 7), item='Read on First Reading and Referred to the Committee on JUSTICE AND HUMAN RIGHTS;', name='Read on First Reading and Referred to Committee', committees=[SenateCommittee(name='JUSTICE AND HUMAN RIGHTS')]),
 False)

In [None]:
FirstReading.parse(
    SenateBill.SenateBillStatus(
        date=datetime.date(2022, 10, 7),
        item="Read on First Reading and Referred to the Committee(s) on EDUCATION, ARTS AND CULTURE; and FINANCE;",
    )
)

(FirstReading(date=datetime.date(2022, 10, 7), item='Read on First Reading and Referred to the Committee(s) on EDUCATION, ARTS AND CULTURE; and FINANCE;', name='Read on First Reading and Referred to Committee', committees=[SenateCommittee(name='EDUCATION, ARTS AND CULTURE'), SenateCommittee(name='FINANCE')]),
 False)

In [None]:
FirstReading.parse(
    SenateBill.SenateBillStatus(
        date=datetime.date(2022, 10, 7),
        item="Read on First Reading and Referred to the Committee(s) on LOCAL GOVERNMENT; CIVIL SERVICE, GOVERNMENT REORGANIZATION AND PROFESSIONAL REGULATION and FINANCE;",
    )
)

(FirstReading(date=datetime.date(2022, 10, 7), item='Read on First Reading and Referred to the Committee(s) on LOCAL GOVERNMENT; CIVIL SERVICE, GOVERNMENT REORGANIZATION AND PROFESSIONAL REGULATION and FINANCE;', name='Read on First Reading and Referred to Committee', committees=[SenateCommittee(name='LOCAL GOVERNMENT'), SenateCommittee(name='CIVIL SERVICE, GOVERNMENT REORGANIZATION AND PROFESSIONAL REGULATION'), SenateCommittee(name='FINANCE')]),
 False)
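The multi-committee branch relies on a single `re.split` call, so it is worth isolating. The sketch below applies the same pattern to the two example strings above, without the model classes. `SLUG`, `SEPARATOR`, and `committee_names` are hypothetical names introduced for illustration.

```python
import re
from typing import List

SLUG = "Read on First Reading and Referred to the Committee(s) on "
# Same pattern as in FirstReading.parse: split on a lowercase "and" or on ";",
# swallowing any spaces around either separator.
SEPARATOR = re.compile(r"[ ]*and[ ]*|[ ]*;[ ]*")


def committee_names(item: str) -> List[str]:
    """Strip the slug, split on the separators, and drop empty fragments."""
    remainder = item.replace(SLUG, "")
    return [name for name in SEPARATOR.split(remainder) if name != ""]


two = committee_names(
    "Read on First Reading and Referred to the Committee(s) on "
    "EDUCATION, ARTS AND CULTURE; and FINANCE;"
)
three = committee_names(
    "Read on First Reading and Referred to the Committee(s) on "
    "LOCAL GOVERNMENT; CIVIL SERVICE, GOVERNMENT REORGANIZATION "
    "AND PROFESSIONAL REGULATION and FINANCE;"
)
```

The split is case-sensitive, which is why the uppercase `AND` inside a committee name survives while the lowercase `and` between names acts as a separator. The pattern has no word boundaries, so a committee name that contained a lowercase `and` would be split incorrectly. The examples here suggest names are always uppercase, but that is an assumption about the data.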

### Committee Proceedings

In [None]:
# | export
class CommitteeProceedings(SenateBill.SenateBillStatus):
    name: str = "Conducted Committee Proceedings"

    @classmethod
    def parse(cls, h):
        if h.item == "Conducted COMMITTEE MEETINGS/HEARINGS;":
            return (cls(**h.dict()), False)
        return (None, True)

In [None]:
CommitteeProceedings.parse(
    SenateBill.SenateBillStatus(
        date=datetime.date(2022, 10, 7), item="Conducted COMMITTEE MEETINGS/HEARINGS;"
    )
)

(CommitteeProceedings(date=datetime.date(2022, 10, 7), item='Conducted COMMITTEE MEETINGS/HEARINGS;', name='Conducted Committee Proceedings'),
 False)

### Approved on Second Reading

In [None]:
# | export
class ApprovedOnSecondReading(SenateBill.SenateBillStatus):
    name: str = "Approved On Second Reading"
    with_amendments: bool

    @classmethod
    def parse(cls, h):
        if h.item == "Approved on Second Reading with Amendments;":
            return (cls(**h.dict(), with_amendments=True), False)
        if h.item == "Approved on Second Reading without Amendment;":
            return (cls(**h.dict(), with_amendments=False), False)
        return (None, True)

In [None]:
ApprovedOnSecondReading.parse(
    SenateBill.SenateBillStatus(
        date=datetime.date(2022, 10, 7),
        item="Approved on Second Reading with Amendments;",
    )
)

(ApprovedOnSecondReading(date=datetime.date(2022, 10, 7), item='Approved on Second Reading with Amendments;', name='Approved On Second Reading', with_amendments=True),
 False)

In [None]:
ApprovedOnSecondReading.parse(
    SenateBill.SenateBillStatus(
        date=datetime.date(2022, 10, 7),
        item="Approved on Second Reading without Amendment;",
    )
)

(ApprovedOnSecondReading(date=datetime.date(2022, 10, 7), item='Approved on Second Reading without Amendment;', name='Approved On Second Reading', with_amendments=False),
 False)
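The two exact strings differ slightly ("with Amendments" versus "without Amendment"), which is easy to miss when reading the `if` chain. As a sketch, the same decision can be expressed as a lookup table. `AMENDMENT_FLAGS` and `amendment_flag` are hypothetical names and are not part of this module.

```python
from typing import Optional

# Exact status strings taken from the parser above, mapped to the flag they imply.
AMENDMENT_FLAGS = {
    "Approved on Second Reading with Amendments;": True,
    "Approved on Second Reading without Amendment;": False,
}


def amendment_flag(item: str) -> Optional[bool]:
    """Return the with_amendments flag, or None for any other status text."""
    return AMENDMENT_FLAGS.get(item)


with_flag = amendment_flag("Approved on Second Reading with Amendments;")
without_flag = amendment_flag("Approved on Second Reading without Amendment;")
unrelated = amendment_flag("Pending in the Committee")
```

A `None` result corresponds to the `(None, True)` branch of `parse()`: the text is not a second-reading approval, so the next class should be tried.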