This script collects articles from the Wall Street Journal and returns them in dict format.
nsloader is not registered on PyPI. You have to install it from GitHub directly.
$ python -m pip install nsloader
nsloader is tested with Python 3.10.
Additionally, you have to run the following in your execution environment.
$ playwright install
The example below loads a Wall Street Journal article and converts it to dictionary format.
NOTE: You have to set two environment variables, WSJ_USERNAME and WSJ_PASSWORD, before execution.
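Because the loader depends on both variables, it can help to check them before starting a run. The following is a minimal sketch that uses only the standard library. The helper name missing_wsj_credentials is hypothetical and not part of nsloader; only the two variable names come from the note above.

```python
import os

# Variable names taken from the note above.
REQUIRED_VARS = ("WSJ_USERNAME", "WSJ_PASSWORD")


def missing_wsj_credentials(environ=os.environ):
    """Return the required variable names that are unset or empty.

    Hypothetical helper for illustration; nsloader does not provide it.
    """
    return [name for name in REQUIRED_VARS if not environ.get(name)]


missing = missing_wsj_credentials()
if missing:
    print("Missing environment variables: " + ", ".join(missing))
else:
    print("WSJ credentials found in the environment.")
```

Passing a plain dict instead of os.environ makes the check easy to exercise without touching the real environment.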
>>> from nsloader import wsj
>>> article = wsj.Article()
>>> article.load('https://www.wsj.com/articles/...')
>>> print(article.to_dict())
{"url": "https://www.wsj.com/articles/...", "title": "The Fed ...", "sub_title": "As expected ...", ... }
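Once an article is available as a dictionary, it can be stored with ordinary Python tools. The sketch below serializes one record as a single JSON line. The record here is a stand-in for the result of article.to_dict(): its keys follow the example output above, its values are placeholders, and the real dictionary may contain more keys. The function name to_json_line is hypothetical, and default=str is an assumption in case some values (for example dates) are not directly JSON-serializable.

```python
import json

# Stand-in for article.to_dict(); keys follow the example output above,
# values are placeholders, and the real dictionary may hold more keys.
record = {
    "url": "https://www.wsj.com/articles/...",
    "title": "The Fed ...",
    "sub_title": "As expected ...",
}


def to_json_line(article_dict):
    """Serialize one article dictionary as a single JSON line.

    default=str is an assumption: it converts any value that json cannot
    handle natively into its string form instead of raising TypeError.
    """
    return json.dumps(article_dict, ensure_ascii=False, default=str)


line = to_json_line(record)
print(line)
```

Writing one such line per article to a file gives a JSON Lines archive that can be read back with json.loads, one line at a time.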