Bloomberg Parse Error #360

Closed
3aboooody56 opened this issue Mar 10, 2024 · 8 comments · Fixed by #362
Labels
module malfunction (report about malfunctioning module)

Comments

@3aboooody56

Hello,

The Bloomberg module has been returning parse errors lately. From what I found, there are two issues: one concerns the encoding/decoding of the HTML page, and the other is a parsing error.

Removing line 42 of the module solves the encoding/decoding issue. That line reads: 'Accept-Encoding' => 'br',

As for the parsing issue, changing line 58 to the following solves it:
my $date = $tree->look_down(_tag=>'span', 'class'=>qr/^marketLastUpdate_exchangeDelay___PZEn/)->right();
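For readers unfamiliar with the pattern in that line: the class name carries a generated suffix, so the match is done with a regex rather than an exact string. A minimal sketch of the same idea follows, written in Python purely as an illustration (the module itself is Perl and uses look_down). The helper names and the sample HTML are hypothetical, and unlike the Perl line, which takes the node to the right of the span, this sketch simply returns the text inside the matching span.

```python
import re
from html.parser import HTMLParser

# Prefix taken from the comment above; the generated suffix after it may change.
CLASS_PATTERN = re.compile(r"^marketLastUpdate_exchangeDelay__")


class ClassPrefixFinder(HTMLParser):
    """Collect the text of every <span> whose class matches a pattern."""

    def __init__(self, pattern):
        super().__init__()
        self.pattern = pattern
        self.depth = 0      # > 0 while inside a matching <span>
        self.matches = []

    def handle_starttag(self, tag, attrs):
        if tag != "span":
            return
        if self.depth:
            # Nested <span> inside a match: track depth so we close correctly.
            self.depth += 1
            return
        classes = (dict(attrs).get("class") or "").split()
        if any(self.pattern.search(name) for name in classes):
            self.depth = 1
            self.matches.append("")

    def handle_endtag(self, tag):
        if tag == "span" and self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.matches[-1] += data


def find_by_class(html, pattern=CLASS_PATTERN):
    """Return the stripped text of each span whose class matches pattern."""
    finder = ClassPrefixFinder(pattern)
    finder.feed(html)
    finder.close()
    return [text.strip() for text in finder.matches]


# Hypothetical markup, not copied from a real Bloomberg page.
SAMPLE = (
    '<div><span class="marketLastUpdate_exchangeDelay___PZEn">Delayed</span>'
    '<span class="other">x</span></div>'
)
print(find_by_class(SAMPLE))
```

Matching on the stable prefix instead of the full class name should survive a change of the suffix, but it still breaks if the site renames the component.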

@bpschuck
Contributor

@3aboooody56
Thanks for the info.
It would be appreciated if you could create a pull request. Please note that we ask to update "Changes" and "Modules-README.yml" to save us some housecleaning time when preparing to push a release to CPAN. As always, check for a clean execution of the test script and update any symbols that are no longer traded.

@3aboooody56
Author

3aboooody56 commented Mar 17, 2024

@bpschuck

The changes I mentioned earlier partially fix the Bloomberg module. After testing the changes, the module fails to retrieve the prices for US / USD-based stocks (works for other stocks). The HTML document/webpage is different for US stocks - at least the HTML elements that contain the stock price are different. I tried fixing this by having two separate look_down method calls for the two cases.

But I am now running into another issue with the page for US stocks: in the HTML content pulled by the module (opened as a text file), the price is missing, or at least the element where it should be is empty:

<div class="currentPrice_currentPriceContainer__nC8vw">
   <div class="priceDelta_price__Acvmw">
      <span class="priceDelta_black__KE45q"></span>

When I open that same pulled HTML content in a browser, the price shows up in the element as follows:

<div class="currentPrice_currentPriceContainer__nC8vw">
   <div class="priceDelta_price__Acvmw">
      <span class="priceDelta_black__KE45q">141.18</span>

This was the case for GOOGL:US.

Would appreciate any insight as to why this happens and what can be done about it. Thanks.

@bpschuck
Contributor

@3aboooody56
Sorry, but I currently don't have the time to troubleshoot the module. Apparently Bloomberg has made changes to the structure of the web pages that break the parsing logic in the module. There is a similar issue with the Fidelity.pm that was discovered 6 months ago.

bpschuck added the "module malfunction" label Mar 17, 2024
@pghmcfc
Contributor

pghmcfc commented Mar 18, 2024

@bpschuck, @3aboooody56

> Would appreciate any insight as to why this happens and what can be done about it. Thanks.

Looks like the page fills in the price using some JavaScript. I came across the exact same issue when looking at this yesterday. Fortunately, there is a giant blob of JSON embedded as a script in the page, from which it is easy to extract the data. PR incoming soon.
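For anyone curious what "extract the data from the embedded JSON" looks like in practice, here is a rough sketch. It is Python for illustration only and is not the code from the PR, which is Perl. The script id `__NEXT_DATA__`, the JSON keys `quote` and `price`, and the sample page are all assumptions made for this example; the real page structure has to be checked against what the module actually downloads.

```python
import json
from html.parser import HTMLParser


class ScriptJSONExtractor(HTMLParser):
    """Capture the body of the first <script> tag that has a given id."""

    def __init__(self, script_id):
        super().__init__()
        self.script_id = script_id
        self.capturing = False
        self.done = False
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and not self.done:
            if dict(attrs).get("id") == self.script_id:
                self.capturing = True

    def handle_endtag(self, tag):
        if tag == "script" and self.capturing:
            self.capturing = False
            self.done = True

    def handle_data(self, data):
        # Script bodies are passed through verbatim, so the JSON stays intact.
        if self.capturing:
            self.chunks.append(data)


def extract_embedded_json(html, script_id="__NEXT_DATA__"):
    """Return the parsed JSON from the matching <script>, or None if absent."""
    parser = ScriptJSONExtractor(script_id)
    parser.feed(html)
    parser.close()
    if not parser.chunks:
        return None
    return json.loads("".join(parser.chunks))


# Hypothetical page: the visible span is empty, the data lives in the script.
SAMPLE_PAGE = """<html><body>
<span class="priceDelta_black__KE45q"></span>
<script id="__NEXT_DATA__" type="application/json">
{"quote": {"price": 141.18, "currency": "USD"}}
</script>
</body></html>"""

data = extract_embedded_json(SAMPLE_PAGE)
print(data["quote"]["price"])  # prints 141.18
```

The point of the approach is that the JSON is present in the raw response, so no JavaScript needs to run to get at the price.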

pghmcfc added a commit to pghmcfc/finance-quote that referenced this issue Mar 18, 2024
This should fix finance-quote#360

At first it just looked like some of the HTML structure had changed,
requiring slightly different tag searches. However, at least for US
stocks, the price data now seems to be filled in using a script, so
the HTML as seen by F::Q is not the same as what can be saved from
a browser and the price may be missing.

Fortunately, the data is available in a nice JSON structure that is
embedded in a <script> tag, so it's possible to retrieve all of the
required data by extracting and parsing that JSON blob.
@3aboooody56
Author

Thanks for the fix.

As a GnuCash user, how can I have it use a modified version of F::Q instead of having to wait for the next release?

@pghmcfc
Contributor

pghmcfc commented Mar 20, 2024

@3aboooody56

  • Go to pull request #362, "Fix Bloomberg module, now extracting data from JSON blob embedded in the HTML"
  • Select the "Files changed" tab
  • In the code tree view on the left hand side, click on the file of interest, "Bloomberg.pm"
  • In the main part of the window, from the menu ("...") for that file, select "View file"
  • When the modified file is shown, click the "Download raw file" option (to the right of the "Raw" button)
  • Replace the file "Bloomberg.pm" in your current installation with the one you just downloaded

That should do it.

@3aboooody56
Author

Thanks, that worked. Took me a bit to find the version of Perl and F::Q that GnuCash was using.

@bpschuck
Contributor

> Thanks, that worked. Took me a bit to find the version of Perl and F::Q that GnuCash was using.

@3aboooody56

That is why I typically don't even attempt to answer questions about installing individual modules. For those with experience in Perl it is simple, but it is possible to find multiple instances of F::Q installed.
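If it helps anyone hitting the multiple-installs problem, below is a small sketch for listing every copy of Bloomberg.pm under a set of directories, so you can see which ones exist before replacing a file. It is Python rather than Perl, the function name is made up for this example, and the directories you pass in are up to you: they should be the library directories of whichever Perl installation GnuCash actually runs, which this sketch does not try to detect.

```python
import os

# Path of the module relative to a Perl library directory.
MODULE_RELATIVE_PATH = os.path.join("Finance", "Quote", "Bloomberg.pm")


def find_module_copies(roots, relative_path=MODULE_RELATIVE_PATH):
    """Return every file under the given roots whose path ends with relative_path."""
    target_name = os.path.basename(relative_path)
    suffix = os.sep + relative_path
    found = []
    for root in roots:
        for dirpath, _dirnames, filenames in os.walk(root):
            if target_name not in filenames:
                continue
            candidate = os.path.join(dirpath, target_name)
            # Skip files that share the name but sit outside Finance/Quote.
            if candidate.endswith(suffix):
                found.append(candidate)
    return sorted(found)
```

Usage would be something like `find_module_copies(["/some/perl/lib", "/another/perl/lib"])`, with placeholder paths replaced by your own. More than one result means more than one installed copy, and only the one loaded by GnuCash's Perl needs replacing.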

pghmcfc added a commit to pghmcfc/finance-quote that referenced this issue Mar 24, 2024
bpschuck added a commit that referenced this issue Apr 16, 2024
	* Removed not working modules. Issues #346, #366, and #368.
		Fidelity.pm, Cdbfundlibrary.com, Fundata.pm, and Fool.pm.
	* YahooJSON.pm - Added code to retrieve cookies and a "crumb" required
		to continue to utilize the v11 API. Issue #369.
		The YahooJSON.pm currency module was changed to use the v8 API.
	* Added initial version of CONTRIBUTING.pod that metacpan.org utilizes.
		It will completely replace the Hacker's Guide in the future.
	* Bloomberg.pm - Changed module to extract data from JSON structure embedded within the HTML - Issue #360
	* NSEIndia.pm - Eliminated need to use temp folders by storing file data from URL into a variable.

3 participants