# `keep_partial` Tests
**`keep_partial` is not currently implemented.**
 
The `keep_partial` parameter in a section definition only applies when a list of subsections is given. In all other cases the `keep_partial` parameter is ignored.
When a parent section definition contains a list of subsections, the parent section or the source can end before all of the subsections have been completed.
- When `keep_partial=True` is set, the final subsection list is still returned, with empty entries for the subsections that did not get read.

- When `keep_partial=False` (the default) is set, the incomplete final subsection list is not returned.
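
The two behaviours described above can be illustrated with a small stand-alone sketch. It does not use the `sections` package: `pair_groups`, its `stop_text` argument, and `SAMPLE_TEXT` are hypothetical names chosen for illustration, and the logic is only an assumption about the intended semantics, not the library's code.

```python
from pprint import pprint
from typing import List

SAMPLE_TEXT = [
    'Text to be ignored',
    'StartSection A',
    'EndSection A',
    'StartSection B',
    'EndSection B',
    'StartSection C',
    'More text to be ignored',   # Ends the parent section.
    'EndSection C',
    ]


def pair_groups(lines: List[str], stop_text: str = 'ignored',
                keep_partial: bool = False) -> List[List[List[str]]]:
    """Group 'StartSection'/'EndSection' lines into [start, end] pairs.

    Hypothetical illustration only; not the sections library's code.
    """
    groups = []
    current = None  # The group being built, or None between groups.
    for index, line in enumerate(lines):
        # The parent section ends at the stop text.
        # The first item is not tested, as in the notebook's setup.
        if index > 0 and stop_text in line:
            break
        if line.startswith('StartSection'):
            # Simplification: a new start replaces any unfinished group.
            current = [[line], []]
        elif line.startswith('EndSection') and current is not None:
            current[1].append(line)
            groups.append(current)
            current = None
    # A group left open when the parent ended is kept only on request.
    if current is not None and keep_partial:
        groups.append(current)
    return groups


pprint(pair_groups(SAMPLE_TEXT, keep_partial=True))
pprint(pair_groups(SAMPLE_TEXT, keep_partial=False))
```

With `keep_partial=True` the sketch returns a third group whose second list is empty; with `keep_partial=False` only the two completed groups are returned.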

## Setup

### Imports

In [1]:
from typing import List
from pathlib import Path
from pprint import pprint
import re
import sys

import pandas as pd
import xlwings as xw

from buffered_iterator import BufferedIterator
import text_reader as tp
from sections import Rule, RuleSet, SectionBreak, ProcessingMethods, Section

### Logging

In [2]:
import logging
logging.basicConfig(format='%(name)-20s - %(levelname)s: %(message)s')
#logging.basicConfig(level=logging.DEBUG)
logger = logging.getLogger('Two Line SubSection Tests')
logger.setLevel(logging.DEBUG)
#logger.setLevel(logging.INFO)

### Test Text

In [3]:
GENERIC_TEST_TEXT = [
    'Text to be ignored',
    'StartSection A',
    'EndSection A',
    'StartSection B',
    'EndSection B', 
    'More text to be ignored',
    ]

### Subsection Definition

### Combined Start and End Subsections, Each a Single Line
***sub_section1***
> - Start *Before* `StartSection`
> - End *Before* `EndSection`

***sub_section2*** 
> - Start *Before* `EndSection`
> - End *Before* ___`True`___ (Always Break)
> - Don't enable testing of first item

`processor=[[sub_section1, sub_section2]]`

In [4]:
start_sub_section = Section(
    section_name='StartSection',
    start_section=SectionBreak('StartSection', break_offset='Before'),
    end_section=SectionBreak('EndSection', break_offset='Before')
    )
end_sub_section = Section(
    section_name='EndSection',
    start_section=SectionBreak('EndSection', break_offset='Before'),
    end_section=SectionBreak(True, break_offset='Before')
    )
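
As a rough picture of what `break_offset='Before'` means for the two definitions above, the following stand-alone sketch keeps the line that triggers the start and leaves out the line that triggers the end. `take_section` is a hypothetical helper written for illustration; it is an assumption about the behaviour, inferred from the test results in this notebook, and is not part of the `sections` package.

```python
from typing import List


def take_section(lines: List[str], start_text: str,
                 end_text: str) -> List[str]:
    """Return the lines from the start trigger up to the end trigger.

    Hypothetical illustration of 'Before' offsets on both breaks.
    """
    section = []
    started = False
    for line in lines:
        if not started:
            if start_text in line:
                # Start 'Before': the boundary sits before the trigger
                # line, so the trigger line belongs to this section.
                started = True
                section.append(line)
        elif end_text in line:
            # End 'Before': the boundary sits before the trigger line,
            # so the trigger line is left for the next section.
            break
        else:
            section.append(line)
    return section


text = ['Text to be ignored', 'StartSection A', 'EndSection A']
print(take_section(text, 'StartSection', 'EndSection'))
```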


## Without Partial Break

In [5]:

full_section = Section(
    section_name='Full',
    processor=[[start_sub_section, end_sub_section]]
    )
pprint(full_section.read(GENERIC_TEST_TEXT))

Sections             - DEBUG: Resetting source for: Full.
Sections             - DEBUG: Starting New Section: Full.
Sections             - DEBUG: Resetting source for: StartSection.
Sections             - DEBUG: Advancing to start of StartSection.
Sections             - DEBUG: In:	Full	Got item:	Text to be ignored
Sections             - DEBUG: Break Status:	Scan In Progress
Sections             - DEBUG: This is source item number: 1 in Full
Sections             - DEBUG: Is first item? True
Sections             - DEBUG: end_on_first_item is  False
Sections             - DEBUG: In:	StartSection	Got item:	Text to be ignored
Sections             - DEBUG: Break Status:	Scan In Progress
Sections             - DEBUG: Checking Trigger: SectionBreak
Sections             - DEBUG: in section_break.check
Sections             - DEBUG: In:	Full	Got item:	StartSection A
Sections             - DEBUG: Break Status:	Scan In Progress
Sections             - DEBUG: This is source item number: 2 in Full
Sec

[{'EndSection': ['EndSection A'], 'StartSection': ['StartSection A']},
 {'EndSection': ['EndSection B'], 'StartSection': ['StartSection B']}]


![Good](../examples/Valid.png) Three list levels, with a single item in the deepest level and two items in each of the other two levels.

|Expected|Actual|
|-|-|
|`[['StartSection Name: A'], ['EndSection Name: A']]`|`[['StartSection Name: A'], ['EndSection Name: A']]`|
|`[['StartSection Name: B'], ['EndSection Name: B']]`|`[['StartSection Name: B'], ['EndSection Name: B']]`|

## Defining a Top Section that Breaks between two single line subsections.

#### Defining ***top_section*** 
- Contains an ending break:
    > `end_section=SectionBreak('ignored', break_offset='Before')`.

In [6]:
top_section = Section(
    section_name='Top Section',
    end_section=SectionBreak('ignored', break_offset='Before'),
    processor=[start_sub_section, end_sub_section]
    )
pprint(top_section.read(GENERIC_TEST_TEXT))

Sections             - DEBUG: Resetting source for: Top Section.
Sections             - DEBUG: Starting New Section: Top Section.
Sections             - DEBUG: Process single sub-section EndSection in: Top Section
Sections             - DEBUG: Resetting source for: EndSection.
Sections             - DEBUG: Advancing to start of EndSection.
Sections             - DEBUG: Process single sub-section StartSection in: Top Section
Sections             - DEBUG: Resetting source for: StartSection.
Sections             - DEBUG: Advancing to start of StartSection.
Sections             - DEBUG: In:	Top Section	Got item:	Text to be ignored
Sections             - DEBUG: Break Status:	Scan In Progress
Sections             - DEBUG: This is source item number: 1 in Top Section
Sections             - DEBUG: Is first item? True
Sections             - DEBUG: end_on_first_item is  False
Sections             - DEBUG: In:	StartSection	Got item:	Text to be ignored
Sections             - DEBUG: Break Status:	S

[]


<table>
    <thead><th>Expected</th><th>Actual</th></thead>
    <tr>
        <td><code></code></td>
        <td><code>
          []
        </code></td></tr>
</table>

#### Including unwanted text in between the start and end of subsection C

In [None]:
GENERIC_TEST_TEXT1 = [
    'Text to be ignored',
    'StartSection A',
    'EndSection A',
    'StartSection B',
    'EndSection B',
    'StartSection C',
    'More text to be ignored',   # 'ignored' triggers end of top section
    'EndSection C',
    'Even more text to be ignored', 
    ]

pprint(top_section.read(GENERIC_TEST_TEXT1))

![Good](../examples/Valid.png) Subsections A and B are returned. Subsection C 
has an empty list instead of 'EndSection C'.

<table>
    <thead><th>Expected</th><th>Actual</th></thead>
    <tr>
        <td><code>
          [<br>
            [['StartSection A'], ['EndSection A']],<br>
            [['StartSection B'], ['EndSection B']],<br>
            [['StartSection C'], []]<br>
          ]
        </code></td>
        <td><code>
          [<br>
            [['StartSection A'], ['EndSection A']],<br>
            [['StartSection B'], ['EndSection B']],<br>
            [['StartSection C'], []]<br>
          ]
        </code></td></tr>
</table>

#### Setting `keep_partial=False`
- This should be the default

In [None]:
top_section = Section(
    section_name='Top Section',
    end_section=SectionBreak('ignored', break_offset='Before'),
    processor=[start_sub_section, end_sub_section],
    keep_partial=False
    )

pprint(top_section.read(GENERIC_TEST_TEXT1))

![Good](../examples/Valid.png) Subsections A and B are returned. Subsection C 
has an empty list instead of 'EndSection C'.
- The 3rd section group never finishes because the **ignored** *Top Section* break line 
  occurs before the next **EndSection**.
- Because `keep_partial` is not currently implemented, the partial group is still returned.


<table>
    <thead><th>Expected</th><th>Actual</th></thead>
    <tr>
        <td><code>
          [<br>
            [['StartSection A'], ['EndSection A']],<br>
            [['StartSection B'], ['EndSection B']],<br>
            [['StartSection C'], []]<br>
          ]
        </code></td>
        <td><code>
          [<br>
            [['StartSection A'], ['EndSection A']],<br>
            [['StartSection B'], ['EndSection B']],<br>
            [['StartSection C'], []]<br>
          ]
        </code></td></tr>
</table>

#### Setting `keep_partial=True`

In [None]:
top_section = Section(
    section_name='Top Section',
    end_section=SectionBreak('ignored', break_offset='Before'),
    processor=[start_sub_section, end_sub_section],
    keep_partial=True
    )

pprint(top_section.read(GENERIC_TEST_TEXT1))

![Good](../examples/Valid.png) Subsections A and B are returned. Subsection C 
has an empty list instead of 'EndSection C'.
- The 3rd section group never finishes because the **ignored** *Top Section* break line 
  occurs before the next **EndSection**.

<table>
    <thead><th>Expected</th><th>Actual</th></thead>
    <tr>
        <td><code>
          [<br>
            [['StartSection A'], ['EndSection A']],<br>
            [['StartSection B'], ['EndSection B']],<br>
            [['StartSection C'], []]<br>
          ]
        </code></td>
        <td><code>
          [<br>
            [['StartSection A'], ['EndSection A']],<br>
            [['StartSection B'], ['EndSection B']],<br>
            [['StartSection C'], []]<br>
          ]
        </code></td></tr>
</table>

##### Using same `keep_partial=True` setting with original source.

In [None]:
pprint(top_section.read(GENERIC_TEST_TEXT))

![Good](../examples/Valid.png) `keep_partial=True` has no effect because both 
subsections A and B are completed. 

<table>
    <thead><th>Expected</th><th>Actual</th></thead>
    <tr>
        <td><code>
          [<br>
            [['StartSection Name: A'], ['EndSection Name: A']],<br>
            [['StartSection Name: B'], ['EndSection Name: B']]<br>
          ]
        </code></td>
        <td><code>
          [<br>
            [['StartSection Name: A'], ['EndSection Name: A']],<br>
            [['StartSection Name: B'], ['EndSection Name: B']]<br>
          ]
        </code></td></tr>
</table>

##### Using same `keep_partial=True` with only `end_sub_section`.

In [None]:
top_section = Section(
    section_name='Top Section',
    end_section=SectionBreak('ignored', break_offset='Before'),
    processor=[end_sub_section],
    keep_partial=True
    )

pprint(top_section.read(GENERIC_TEST_TEXT1))

![Good](../examples/Valid.png) `keep_partial=True` has no effect because single 
subsections are processed differently. 
- Results in one less level of lists.

<table>
    <thead><th>Expected</th><th>Actual</th></thead>
    <tr>
        <td><code>
          [<br>
            ['EndSection Name: A'],<br>
            ['EndSection Name: B']<br>
          ]
        </code></td>
        <td><code>
          [<br>
            ['EndSection Name: A'],<br>
            ['EndSection Name: B']<br>
          ]
        </code></td></tr>
</table>
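
The "one less level of lists" remark can be checked with a small helper that measures list nesting. `list_depth` is a hypothetical function written for this illustration, and the two sample values are copied from the expected results in this notebook rather than produced by the `sections` package.

```python
def list_depth(value) -> int:
    """Return how many levels of nested lists a value contains."""
    if not isinstance(value, list):
        return 0
    # An empty list still counts as one level.
    return 1 + max((list_depth(item) for item in value), default=0)


# Result shape for a single subsection.
single = [['EndSection Name: A'], ['EndSection Name: B']]
# Result shape for a list of two subsections.
paired = [
    [['StartSection Name: A'], ['EndSection Name: A']],
    [['StartSection Name: B'], ['EndSection Name: B']],
    ]

print(list_depth(single), list_depth(paired))
```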

#### Dropping the corresponding *EndSection* for a *StartSection*.

In [None]:
GENERIC_TEST_TEXT2 = [
    'Text to be ignored',
    'StartSection A',
    'EndSection A',
    'StartSection B',  # Missing 'EndSection B',
    
    'StartSection C',
    'More text to be ignored',   # 'ignored' triggers end of top section
    'EndSection C',
    'Even more text to be ignored', 
    ]

In [None]:
top_section = Section(
    section_name='Top Section',
    end_section=SectionBreak('ignored', break_offset='Before'),
    processor=[start_sub_section, end_sub_section],
    keep_partial=True
    )

pprint(top_section.read(GENERIC_TEST_TEXT2))

![Good](../examples/Valid.png) Expecting `[['StartSection B'], []]` as the final group.
- The 3rd section group never starts because the **ignored** *Top Section* break line 
  occurs before the next **EndSection**, so the 2nd section group never finishes.

<table>
    <thead><th>Expected</th><th>Actual</th></thead>
    <tr>
        <td><code>
          [<br>
            [['StartSection A'], ['EndSection A']],<br>
            [['StartSection B'], []]<br>
          ]
        </code></td>
        <td><code>
          [<br>
            [['StartSection A'], ['EndSection A']],<br>
            [['StartSection B'], []]<br>
          ]
        </code></td></tr>
</table>

In [None]:
top_section = Section(
    section_name='Top Section',
    end_section=SectionBreak('ignored', break_offset='Before'),
    processor=[start_sub_section, end_sub_section],
    keep_partial=False
    )

pprint(top_section.read(GENERIC_TEST_TEXT2))

![Good](../examples/Valid.png) Expecting the partial group `[['StartSection B'], []]`.
- The 2nd section group never finishes because the **ignored** *Top Section* break 
  line occurs first. With `keep_partial=False` the partial subsection should have been dropped, but it is still returned.

<table>
    <thead><th>Expected</th><th>Actual</th></thead>
    <tr>
        <td><code>
          [<br>
            [['StartSection A'], ['EndSection A']],<br>
            [['StartSection B'], []]<br>
          ]
        </code></td>
        <td><code>
          [<br>
            [['StartSection A'], ['EndSection A']],<br>
            [['StartSection B'], []]<br>
          ]
        </code></td></tr>
</table>
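
Until `keep_partial=False` takes effect, incomplete groups could be filtered out after reading. The sketch below is one possible workaround, assuming the list-of-lists result shape shown in the table above; `drop_partial` is a hypothetical helper and not part of the `sections` package.

```python
from typing import List


def drop_partial(groups: List[List[List[str]]]) -> List[List[List[str]]]:
    """Keep only the groups in which every subsection list has items."""
    # all(group) is False when any subsection list in the group is empty.
    return [group for group in groups if all(group)]


result = [
    [['StartSection A'], ['EndSection A']],
    [['StartSection B'], []],   # Partial group: EndSection never read.
    ]
print(drop_partial(result))
```

This filter treats any empty subsection list as a sign of an incomplete group, so it would also drop a group whose subsection legitimately matched no lines.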

### Including unwanted text in between subsections

In [None]:
GENERIC_TEST_TEXT1 = [
    'Text to be ignored',
    'StartSection Name: A',
    'EndSection Name: A',
    'StartSection Name: B',
    'More text to be ignored', 
    'EndSection Name: B',
    'Even more text to be ignored', 
    ]

pprint(full_section.read(GENERIC_TEST_TEXT1))

![Good](../examples/Valid.png) The *to be ignored* text between subsections is dropped.

|Expected|Actual|
|-|-|
|`[['StartSection Name: A'], ['EndSection Name: A']]`|`[['StartSection Name: A'], ['EndSection Name: A']]`|
|`[['StartSection Name: B'], ['EndSection Name: B']]`|`[['StartSection Name: B'], ['EndSection Name: B']]`|