### The method of splitting long paragraphs

Suppose $A$ is an arbitrary paragraph from an article.

Define $len_{word}(A)$ as the number of words in $A$.

Define $len_{word}^{*}$ as the maximum number of words allowed in a paragraph (we assume $len_{word}^{*} = 80$).

Define $len_{sentence}(A)$ as the number of sentences in $A$.

#### Procedure

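If $len_{word}(A) > len_{word}^{*}$ and $len_{sentence}(A) \geq 2$, $A$ will be transformed into
$[A_1, A_2, \ldots, A_n]$, where $n = \min\left(len_{sentence}(A), \left\lceil \frac{len_{word}(A)}{len_{word}^{*}} \right\rceil\right)$. The ratio is rounded up so that $n$ is an integer, which matches the sample below, where 126 words give $n = 2$.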

Each $A_i$ is a split paragraph that contains only complete sentences, with $len_{sentence}(A_i) \geq 1$.

##### For most cases (each sentence in $A$ has fewer than $len_{word}^{*}$ words):
For $i = 1, 2, \ldots, n$, $len_{word}(A_i) \leq len_{word}^{*}$.

$|len_{word}(A_i) - len_{word}(A_j)| \leq len_{word}^{*}$ for all $i \neq j$.
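
The procedure does not say how sentences are distributed across $A_1, \ldots, A_n$, so the following is only a minimal sketch of one possible approach. It assumes whitespace-separated words, a naive punctuation-based sentence splitter, the ratio rounded up to obtain $n$, and a greedy grouping that aims for chunks of similar length. The names `word_count`, `split_sentences`, `split_paragraph`, and `MAX_WORDS` are hypothetical and do not come from the original implementation.

```python
import math
import re

MAX_WORDS = 80  # len_word^* from the text


def word_count(text):
    # Assumption: words are whitespace-separated tokens.
    return len(text.split())


def split_sentences(paragraph):
    # Assumption: a sentence ends with '.', '!' or '?' followed by whitespace.
    return [s for s in re.split(r"(?<=[.!?])\s+", paragraph.strip()) if s]


def split_paragraph(paragraph, max_words=MAX_WORDS):
    sentences = split_sentences(paragraph)
    total = word_count(paragraph)
    # Only split paragraphs that are too long and have at least two sentences.
    if total <= max_words or len(sentences) < 2:
        return [paragraph]

    # Assumption: the ratio is rounded up so that n is an integer.
    n = min(len(sentences), math.ceil(total / max_words))
    target = total / n  # aim for chunks of roughly equal length
    chunks, current, current_words = [], [], 0

    def close_chunk():
        nonlocal current, current_words
        chunks.append(" ".join(current))
        current, current_words = [], 0

    for idx, sentence in enumerate(sentences):
        words = word_count(sentence)
        # Start a new chunk first if this sentence would exceed the cap.
        if current and len(chunks) < n - 1 and current_words + words > max_words:
            close_chunk()
        current.append(sentence)
        current_words += words
        sentences_left = len(sentences) - idx - 1
        chunks_left = n - len(chunks) - 1
        # Close when the target is reached, or when every remaining
        # sentence is needed to fill one remaining chunk.
        if chunks_left > 0 and (current_words >= target or sentences_left == chunks_left):
            close_chunk()
    if current:
        chunks.append(" ".join(current))
    return chunks
```

This sketch always returns exactly $n$ chunks of whole sentences, but it cannot guarantee the word limit when a single sentence is longer than $len_{word}^{*}$, and a different tokenizer may give different word counts than those in the sample.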


#### Sample 

Article Link: https://thinkingmomsrevolution.com/ruth-snyder-r-n-auty-not-naughty-mom/

##### Input $A$ such that $len_{word}(A) = 126$, $len_{sentence}(A) = 6$
When my first son was formally diagnosed with autism, I had already known he had autism at least a year before it was confirmed. By the time my second son was diagnosed, I’d had at least two, if not three years of processing the facts and details before his autism was confirmed, medically speaking. When the neurologist gave me the news, his face had that look on it – the look people get when they are about to deliver devastating news. I was familiar with that look from working as a registered nurse for many years and watching doctors let loved ones know that death had come. I was very confused by the neurologist’s somber tone when confirming the diagnosis. This was our third official diagnosis, and I didn’t see anything wrong with autism.

##### Output

##### $A_1$ such that $len_{word}(A_1) = 77$, $len_{sentence}(A_1) = 3$
When my first son was formally diagnosed with autism, I had already known he had autism at least a year before it was confirmed. By the time my second son was diagnosed, I’d had at least two, if not three years of processing the facts and details before his autism was confirmed, medically speaking. When the neurologist gave me the news, his face had that look on it – the look people get when they are about to deliver devastating news.

##### $A_2$ such that $len_{word}(A_2) = 49$, $len_{sentence}(A_2) = 3$
I was familiar with that look from working as a registered nurse for many years and watching doctors let loved ones know that death had come. I was very confused by the neurologist’s somber tone when confirming the diagnosis. This was our third official diagnosis, and I didn’t see anything wrong with autism.
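
As a quick check of this sample against the procedure, assume that the ratio $\frac{len_{word}(A)}{len_{word}^{*}}$ is rounded up to the nearest integer (an assumption that the sample's two output paragraphs are consistent with):

$$n = \min\left(6, \left\lceil \frac{126}{80} \right\rceil\right) = \min(6, 2) = 2$$

Both outputs respect the word limit, since $77 \leq 80$ and $49 \leq 80$, and their lengths add up to the input length, $77 + 49 = 126$.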



### The method of combining short paragraphs

Suppose $L$ is the list of paragraphs from one article, in their original order, such that $L = [A_1, A_2, \ldots, A_n]$.

Define $len_{word}(A_i)$ as the number of words in $A_i$.

Define $len_{word}^{*}$ as the maximum number of words allowed in a paragraph (we assume $len_{word}^{*} = 80$).

Define $len_{word}^{-}$ as the minimum number of words allowed in a paragraph (we assume $len_{word}^{-} = 10$).

#### Procedure

$L_{new} = L$

For $i = 1$ to $i = n-1$, if $len_{word}(A_i) \leq len_{word}^{-}$, remove $A_i$ from $L_{new}$ and attach it to the beginning of $A_{i+1}$ in $L_{new}$, keeping a '\n' between them.

The basic idea is to merge every $A_i$ with $len_{word}(A_i) \leq len_{word}^{-}$ into the paragraph that follows it.

The new merged paragraph is $A_j = A_i + A_{i+1}$.

If $len_{word}(A_{i+1}) \leq len_{word}^{-}$ as well, then $A_j = A_i + A_{i+1} + A_{i+2}$, and so on until a paragraph longer than $len_{word}^{-}$ is reached.

##### For most cases
After merging, for $i = 1, 2, \ldots, n-1$, $len_{word}(A_i) \geq len_{word}^{-}$.

$len_{word}(A_n)$ might be less than $len_{word}^{-}$, because the last paragraph has no following paragraph to merge into.
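
The merging steps above can be sketched as follows. This is a minimal illustration rather than the original implementation: it assumes whitespace-separated words and a plain '\n' separator, and the names `word_count`, `combine_short_paragraphs`, and `MIN_WORDS` are hypothetical.

```python
MIN_WORDS = 10  # len_word^- from the text


def word_count(text):
    # Assumption: words are whitespace-separated tokens.
    return len(text.split())


def combine_short_paragraphs(paragraphs, min_words=MIN_WORDS, sep="\n"):
    merged = []
    pending = []  # short paragraphs waiting to be attached to the next one
    for paragraph in paragraphs:
        pending.append(paragraph)
        # A paragraph longer than the minimum ends the current run of merges.
        if word_count(paragraph) > min_words:
            merged.append(sep.join(pending))
            pending = []
    # A short paragraph at the very end has nothing to merge into.
    if pending:
        merged.append(sep.join(pending))
    return merged


paragraphs = [
    "In 2017 there were 120 cases of measles.",
    "In 2016, there were 86.",
    "In 2015, there were 188 cases.",
    "In 2014 there were 667 cases of measles, no cases of encephalitis, and no death.",
    "In 2013 there were 189 cases of measles, no encephalitis and no death.",
]
merged = combine_short_paragraphs(paragraphs)
print(len(merged))  # prints 2
```

Collecting short paragraphs in `pending` handles chains of consecutive short paragraphs in a single pass, which corresponds to the iterated case $A_j = A_i + A_{i+1} + A_{i+2}$.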

#### Sample

Article Link: https://www.livingwhole.org/this-mama-isnt-scared-of-the-shmeasle-measles/

##### Input


    L = ["In 2017 there were 120 cases of measles.",
         "In 2016, there were 86.",
         "In 2015, there were 188 cases.",
         "In 2014 there were 667 cases of measles, no cases of encephalitis, and no death.",
         "In 2013 there were 189 cases of measles, no encephalitis and no death.",
         "I could go on, but you get the point. By and large, measles is an unpleasant rash with a fever but it isn’t deadly. The clinical definition doesn’t support that and neither do the facts. By comparison, as of March 1, 2012 there were 842 serious injuries following the MMR vaccine and 140 deaths. Between 1990 and 2014, were more than 6,058 serious adverse events reported to the Vaccine Adverse Events Reporting System (VAERS). That’s significant when you consider that only 1-10% of adverse events are actually reported to this system. "]

##### Output

    L_new = ['In 2017 there were 120 cases of measles.\n In 2016, there were 86.\n In 2015, there were 188 cases.\n In 2014 there were 667 cases of measles, no cases of encephalitis, and no death.',
             'In 2013 there were 189 cases of measles, no encephalitis and no death.',
             'I could go on, but you get the point. By and large, measles is an unpleasant rash with a fever but it isn’t deadly. The clinical definition doesn’t support that and neither do the facts. By comparison, as of March 1, 2012 there were 842 serious injuries following the MMR vaccine and 140 deaths. Between 1990 and 2014, were more than 6,058 serious adverse events reported to the Vaccine Adverse Events Reporting System (VAERS). That’s significant when you consider that only 1-10% of adverse events are actually reported to this system. ']
