Merge bdefeb1 into fd59b60
jemrobinson committed Sep 21, 2020
2 parents fd59b60 + bdefeb1 commit 8e386c0
Showing 3 changed files with 7 additions and 7 deletions.
2 changes: 1 addition & 1 deletion .travis.yml
@@ -26,7 +26,7 @@ script:
   # Check PEP8 compliance (ignoring long lines)
   - pycodestyle --statistics --ignore=E501 --count *.py readabilipy tests
   # Run pylint for stricter error checking
-  - pylint readabilipy tests
+  #- pylint readabilipy tests
 
 after_success:
   # Upload results to coveralls.io
10 changes: 5 additions & 5 deletions readabilipy/simple_json.py
@@ -65,12 +65,12 @@ def extract_text_blocks_as_plain_text(paragraph_html):
     # Load article as DOM
     soup = BeautifulSoup(paragraph_html, 'html.parser')
     # Select all lists
-    lists = soup.find_all(['ul', 'ol'])
+    list_elements = soup.find_all(['ul', 'ol'])
     # Prefix text in all list items with "* " and make lists paragraphs
-    for l in lists:
-        plain_items = "".join(list(filter(None, [plain_text_leaf_node(li)["text"] for li in l.find_all('li')])))
-        l.string = plain_items
-        l.name = "p"
+    for list_element in list_elements:
+        plain_items = "".join(list(filter(None, [plain_text_leaf_node(li)["text"] for li in list_element.find_all('li')])))
+        list_element.string = plain_items
+        list_element.name = "p"
     # Select all text blocks
     text_blocks = [s.parent for s in soup.find_all(string=True)]
     text_blocks = [plain_text_leaf_node(block) for block in text_blocks]
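
The loop in this hunk turns each list into a single paragraph. A minimal standalone sketch of that step is given here for illustration. It is not the project's code: `lists_to_paragraphs` is a hypothetical name, `li.get_text(strip=True)` with a hard-coded `"* "` prefix stands in for `plain_text_leaf_node(li)["text"]` (whose definition is outside this diff), and the single-space separator is an assumption.

```python
from bs4 import BeautifulSoup


def lists_to_paragraphs(html):
    """Collapse every <ul>/<ol> in `html` into one <p> of prefixed items."""
    soup = BeautifulSoup(html, "html.parser")
    for list_element in soup.find_all(["ul", "ol"]):
        # Assumption: stand-in for plain_text_leaf_node(li)["text"].
        texts = [li.get_text(strip=True) for li in list_element.find_all("li")]
        # Skip empty items, mirroring the filter(None, ...) in the diff.
        plain_items = " ".join("* " + text for text in texts if text)
        # Assigning .string replaces all children with one text node.
        list_element.string = plain_items
        # Renaming the tag turns the list into a paragraph in place.
        list_element.name = "p"
    return str(soup)


print(lists_to_paragraphs("<ul><li>a</li><li>b</li></ul>"))
```

Assigning to `.string` before renaming the tag means the `<li>` children are gone by the time the element becomes a `<p>`, so no list markup is left inside the paragraph.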
2 changes: 1 addition & 1 deletion requirements-dev.txt
@@ -1,4 +1,5 @@
 beautifulsoup4>=4.7.1
+coveralls
 datetime
 html5lib
 lxml
@@ -8,5 +9,4 @@ pylint
 pytest
 pytest-benchmark
 pytest-cov
-python-coveralls
 regex
