feat: add deadline.com custom parser #383

Merged
merged 6 commits into from Apr 24, 2019

Conversation

kik0220
Contributor

@kik0220 kik0220 commented Apr 15, 2019

add deadline.com custom parser

@postlight-org
Collaborator

🤖 Automated Parsing Preview 🤖

Commit: feat: add deadline.com custom parser

Screenshot of fixture (this embed should work after repo is public)

Original Article | HTML Fixture | Parsed Content Preview

Parsed JSON
{
  "title": "Donald Trump Advises Boeing, Tweeting “But What The Hell Do I Know?”; Twitter Answers – Deadline",
  "content": "<div><div class=\"pmc-a-grid-item pmc-a-span2@desktop u-max-width-100p\">\n\t\t\t\t<figure class=\"c-figure u-border-b-1 u-border-color-grey-medium-light\">\n\t<img width=\"450\" height=\"253\" src=\"https://pmcdeadline2.files.wordpress.com/2019/01/donald-trump-2.jpg?w=450&amp;h=253&amp;crop=1\" class=\"c-figure__image\" alt=\"Donald Trump\" srcset=\"https://pmcdeadline2.files.wordpress.com/2019/01/donald-trump-2.jpg?w=1000&amp;h=563&amp;crop=1 1000w, https://pmcdeadline2.files.wordpress.com/2019/01/donald-trump-2.jpg?w=910&amp;h=511&amp;crop=1 910w, https://pmcdeadline2.files.wordpress.com/2019/01/donald-trump-2.jpg?w=681&amp;h=383&amp;crop=1 681w, https://pmcdeadline2.files.wordpress.com/2019/01/donald-trump-2.jpg?w=450&amp;h=253&amp;crop=1 450w, https://pmcdeadline2.files.wordpress.com/2019/01/donald-trump-2.jpg?w=250&amp;h=140&amp;crop=1 250w, https://pmcdeadline2.files.wordpress.com/2019/01/donald-trump-2.jpg?w=225&amp;h=225&amp;crop=1 225w\" sizes=\"(min-width: 87.5rem) 1000px, (min-width: 78.75rem) 681px, (min-width: 48rem) 450px, (max-width: 48rem) 250px\">\n\t\t<figcaption class=\"c-figure__caption u-flex u-flex-direction-column pmc-u-font-family-helvetica pmc-u-padding-tb-025\">\n\n\t\t\t\n\t\t\t\t\t\t\t<span class=\"pmc-u-color-grey-medium-dark pmc-u-font-weight-light pmc-u-font-size-12\">\n\t\t\t\t\tAndrew Harnik/AP/Shutterstock\t\t\t\t</span>\n\t\t\t\n\t\t</figcaption>\n\n\t</figure>\n\n\t\t\t\t<div class=\"a-content pmc-u-line-height-copy pmc-u-font-family-georgia pmc-u-font-size-16 pmc-u-font-size-18@desktop\">\n\t\t\t\t\t<p><a href=\"https://deadline.com/tag/twitter/\" id=\"auto-tag_twitter\">Twitter</a> erupted Monday morning when President <a href=\"https://deadline.com/tag/donald-trump/\" id=\"auto-tag_donald-trump\">Donald Trump</a> shared his branding expertise with <a href=\"https://deadline.com/tag/boeing/\" id=\"auto-tag_boeing\">Boeing</a>, after American Airlines announced it was cancelling Boeing&#x2019;s Max 737 flights 
through mid-August. That after Southwest Airlines, the largest operator of Boeing jets, canceled its Max flights through 5 August.</p>\n\n<p>The U.S. was one of the last countries to ground the plane last month after a second deadly crash in Ethiopia, following by months a crash in Indonesia. Since the second Ethiopian Airlines crash, the stock has plunged 12% and cutting back production, lost 24$ from its market cap.</p>\n<p>&#x201C;What do I know about branding, maybe nothing (but I did become President!),&#x201D; Trump simpered in a morning tweet.</p>\n\n<p>&#x201C;[B]ut if I were Boeing, I would FIX the Boeing 737 MAX, add some additional great features, &amp; REBRAND the plane with a new name,&#x201D; he advised.</p>\n<p>&#x201C;No product has suffered like this one. But again, what the hell do I know?&#x201D; Trump said which, given his history with defunct Trump Airlines and other projects, triggered shooting-fish-in-barrel responses on Twitter, predictably taking &#x201C;Trump Airlines&#x201D; to Top-10 trending status worldwide. Among the responses:</p>\n<div class=\"embed-twitter\">\n<blockquote class=\"twitter-tweet\">\n<p>Trump Airlines. Now there&apos;s a brand. <a href=\"https://twitter.com/realDonaldTrump?ref_src=twsrc%5Etfw\">@realDonaldTrump</a> <a href=\"https://t.co/xdcgcjohW4\">https://t.co/xdcgcjohW4</a></p>\n<p>&#x2014; Brian J. 
Karem (@BrianKarem) <a href=\"https://twitter.com/BrianKarem/status/1117767646961963009?ref_src=twsrc%5Etfw\">April 15, 2019</a></p></blockquote>\n</div>\n<div class=\"embed-twitter\">\n<blockquote class=\"twitter-tweet\">\n<p>Before Trump tries to act as though he knows how to fix Boeing, let&#x2019;s all review what happened to Trump Airlines, also known as &#x2018;Trump Shuttle&#x2019;.<a href=\"https://twitter.com/HillReporter?ref_src=twsrc%5Etfw\">@hillreporter</a><a href=\"https://t.co/EFacdLsB9m\">https://t.co/EFacdLsB9m</a></p>\n<p>&#x2014; Ed Krassenstein (@EdKrassen) <a href=\"https://twitter.com/EdKrassen/status/1117773751217795078?ref_src=twsrc%5Etfw\">April 15, 2019</a></p></blockquote>\n</div>\n<p>Trump&#x2019;s tweet:</p>\n<div class=\"embed-twitter\">\n<blockquote class=\"twitter-tweet\">\n<p>What do I know about branding, maybe nothing (but I did become President!), but if I were Boeing, I would FIX the Boeing 737 MAX, add some additional great features, &amp; REBRAND the plane with a new name.<br>No product has suffered like this one. But again, what the hell do I know?</p>\n<p>&#x2014; Donald J. 
Trump (@realDonaldTrump) <a href=\"https://twitter.com/realDonaldTrump/status/1117736685721223168?ref_src=twsrc%5Etfw\">April 15, 2019</a></p></blockquote>\n</div>\n\t\t\t\t</div>\n\n\t\t\t\t<p class=\"pmc-u-text-align-center\">Subscribe to <a href=\"https://pages.email.deadline.com/signup\" class=\"pmc-u-font-weight-bold\">Deadline Breaking News Alerts</a> and keep your inbox happy.</p><div class=\"article-tags u-flex u-align-items-center u-flex-direction-column@mobile-max u-justify-content-center pmc-u-margin-b-1\">\n\t<span class=\"c-label  pmc-u-font-size-16 pmc-u-margin-r-025\">\n\n\tRead More About:\n</span>\n\t<nav class=\"o-nav o-nav--horizontal \">\n\n\t\n\t<ul class=\"o-nav__list u-justify-content-center@mobile-max pmc-u-margin-a-00 u-flex-wrap-wrap\">\n\t\t\t\t\t<li class=\"o-nav__list-item u-text-transform-uppercase pmc-u-font-size-12 a-icon-before a-icon-forward-slash pmc-u-margin-l-050 pmc-u-margin-b-050@mobile-max\">\n\t\t\t\t<a class=\"c-nav-link  \" href=\"https://deadline.com/tag/boeing/\">\n\tBoeing</a>\n\t\t\t</li>\n\t\t\t\t\t<li class=\"o-nav__list-item u-text-transform-uppercase pmc-u-font-size-12 a-icon-before a-icon-forward-slash pmc-u-margin-l-050 pmc-u-margin-b-050@mobile-max\">\n\t\t\t\t<a class=\"c-nav-link  \" href=\"https://deadline.com/tag/donald-trump/\">\n\tDonald Trump</a>\n\t\t\t</li>\n\t\t\t\t\t<li class=\"o-nav__list-item u-text-transform-uppercase pmc-u-font-size-12 a-icon-before a-icon-forward-slash pmc-u-margin-l-050 pmc-u-margin-b-050@mobile-max\">\n\t\t\t\t<a class=\"c-nav-link  \" href=\"https://deadline.com/tag/twitter/\">\n\tTwitter</a>\n\t\t\t</li>\n\t\t\t</ul>\n</nav>\n</div>\n\n<div class=\"\">\n\t<div class=\"widget widget_pmc_outbrain_widget\"><div class=\"outbrain-widget\">\n\t\t\t<div class=\"OUTBRAIN\"></div>\n\t</div>\n</div></div>\n\n\t\t\t\t<div class=\"pmc-u-margin-tb-2\">\n\t\t\t\t\t\t<div class=\"admz\" id=\"adm-after-article\">\n\t\t\t\t\t<div class=\"adma google-publisher\">\n\t\t\t\t\n<div 
class=\"pmc-adm-goog-pub-div ad-text\">\n\t<div id=\"div-gpt-dl-ros-620x250-uid4\" class=\"ad-rotatable adw-620 adh-250\"></div>\n\t</div>\n\t\t\t</div>\n\t\t\t\t</div>\n\t\t\t\t</div>\n\n\t\t\t\t\n\n<div id=\"comments-loaded\"></div>\n\t\t\t\t<div class=\"pmc-u-margin-tb-2\">\n\t\t\t\t\t\t<div class=\"admz\" id=\"adm-after-comments\">\n\t\t\t\t\t<div class=\"adma google-publisher\">\n\t\t\t\t\n<div class=\"pmc-adm-goog-pub-div \">\n\t<div id=\"div-gpt-dl-ros-620x251-uid5\" class=\"ad-rotatable adw-620 adh-251\"></div>\n\t</div>\n\t\t\t</div>\n\t\t\t\t</div>\n\t\t\t\t</div>\n\t\t\t</div></div>",
  "author": "Lisa de Moraes",
  "date_published": "2019-04-15T06:18:00.000Z",
  "lead_image_url": "https://pmcdeadline2.files.wordpress.com/2019/01/donald-trump-2.jpg?w=1024",
  "dek": null,
  "next_page_url": null,
  "url": "https://deadline.com/2019/04/donald-trump-boeing-max-737-rebrand-advice-twitter-1202595880/",
  "domain": "deadline.com",
  "word_count": 257,
  "direction": "ltr",
  "total_pages": 1,
  "rendered_pages": 1
}

null fields

  • dek

  • next_page_url

1 failed test 😱

DeadlineComExtractor initial test case returns the date_published

See what went wrong
AssertionError [ERR_ASSERTION]: '2019-04-15T06:18:00.000Z' == '2019-04-14T21:18:00.000Z'
    at Object.equal (/home/circleci/project/src/extractors/custom/deadline.com/index.test.js:65:14)
    at tryCatch (/home/circleci/project/node_modules/regenerator-runtime/runtime.js:62:40)
    at Generator.invoke [as _invoke] (/home/circleci/project/node_modules/regenerator-runtime/runtime.js:288:22)
    at Generator.prototype.(anonymous function) [as next] (/home/circleci/project/node_modules/regenerator-runtime/runtime.js:114:21)
    at asyncGeneratorStep (/home/circleci/project/src/extractors/custom/deadline.com/index.test.js:17:103)
    at _next (/home/circleci/project/src/extractors/custom/deadline.com/index.test.js:19:194)
    at <anonymous>
    at process._tickCallback (internal/process/next_tick.js:188:7)
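
Note on the failure above: the two timestamps differ by exactly nine hours, which is what one would expect if a date string with no zone information were parsed on machines in different time zones. Below is a minimal sketch of that effect. It assumes (this thread does not confirm it) that the page shows a bare wall-clock time of 6:18 on April 15, 2019, that CI runs in UTC, and that the expected value was generated on a UTC+9 machine. `wallClock` and `toUtcIso` are hypothetical names used only for illustration.

```javascript
// Assumed wall-clock time as displayed on the page (not taken from the fixture).
const wallClock = { year: 2019, month: 4, day: 15, hour: 6, minute: 18 };

// Interpret a wall-clock time as local to a zone with the given UTC offset
// (in hours) and return the corresponding instant as an ISO 8601 UTC string.
function toUtcIso({ year, month, day, hour, minute }, offsetHours) {
  const asIfUtc = Date.UTC(year, month - 1, day, hour, minute);
  return new Date(asIfUtc - offsetHours * 60 * 60 * 1000).toISOString();
}

const asUtc = toUtcIso(wallClock, 0);  // '2019-04-15T06:18:00.000Z' (value CI produced)
const asJst = toUtcIso(wallClock, 9);  // '2019-04-14T21:18:00.000Z' (value the test expected)
const asPdt = toUtcIso(wallClock, -7); // '2019-04-15T13:18:00.000Z'

console.log({ asUtc, asJst, asPdt });
```

Under these assumptions the first two values match the two sides of the assertion. The third shows the same wall-clock time read as US Pacific daylight time (UTC-7).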


@postlight-org
Collaborator

🤖 Automated Parsing Preview 🤖

Commit: fix: timezone

Screenshot of fixture (this embed should work after repo is public)

Original Article | HTML Fixture | Parsed Content Preview

Parsed JSON
{
  "title": "Donald Trump Advises Boeing, Tweeting “But What The Hell Do I Know?”; Twitter Answers – Deadline",
  "content": "<div><div class=\"pmc-a-grid-item pmc-a-span2@desktop u-max-width-100p\">\n\t\t\t\t<figure class=\"c-figure u-border-b-1 u-border-color-grey-medium-light\">\n\t<img width=\"450\" height=\"253\" src=\"https://pmcdeadline2.files.wordpress.com/2019/01/donald-trump-2.jpg?w=450&amp;h=253&amp;crop=1\" class=\"c-figure__image\" alt=\"Donald Trump\" srcset=\"https://pmcdeadline2.files.wordpress.com/2019/01/donald-trump-2.jpg?w=1000&amp;h=563&amp;crop=1 1000w, https://pmcdeadline2.files.wordpress.com/2019/01/donald-trump-2.jpg?w=910&amp;h=511&amp;crop=1 910w, https://pmcdeadline2.files.wordpress.com/2019/01/donald-trump-2.jpg?w=681&amp;h=383&amp;crop=1 681w, https://pmcdeadline2.files.wordpress.com/2019/01/donald-trump-2.jpg?w=450&amp;h=253&amp;crop=1 450w, https://pmcdeadline2.files.wordpress.com/2019/01/donald-trump-2.jpg?w=250&amp;h=140&amp;crop=1 250w, https://pmcdeadline2.files.wordpress.com/2019/01/donald-trump-2.jpg?w=225&amp;h=225&amp;crop=1 225w\" sizes=\"(min-width: 87.5rem) 1000px, (min-width: 78.75rem) 681px, (min-width: 48rem) 450px, (max-width: 48rem) 250px\">\n\t\t<figcaption class=\"c-figure__caption u-flex u-flex-direction-column pmc-u-font-family-helvetica pmc-u-padding-tb-025\">\n\n\t\t\t\n\t\t\t\t\t\t\t<span class=\"pmc-u-color-grey-medium-dark pmc-u-font-weight-light pmc-u-font-size-12\">\n\t\t\t\t\tAndrew Harnik/AP/Shutterstock\t\t\t\t</span>\n\t\t\t\n\t\t</figcaption>\n\n\t</figure>\n\n\t\t\t\t<div class=\"a-content pmc-u-line-height-copy pmc-u-font-family-georgia pmc-u-font-size-16 pmc-u-font-size-18@desktop\">\n\t\t\t\t\t<p><a href=\"https://deadline.com/tag/twitter/\" id=\"auto-tag_twitter\">Twitter</a> erupted Monday morning when President <a href=\"https://deadline.com/tag/donald-trump/\" id=\"auto-tag_donald-trump\">Donald Trump</a> shared his branding expertise with <a href=\"https://deadline.com/tag/boeing/\" id=\"auto-tag_boeing\">Boeing</a>, after American Airlines announced it was cancelling Boeing&#x2019;s Max 737 flights 
through mid-August. That after Southwest Airlines, the largest operator of Boeing jets, canceled its Max flights through 5 August.</p>\n\n<p>The U.S. was one of the last countries to ground the plane last month after a second deadly crash in Ethiopia, following by months a crash in Indonesia. Since the second Ethiopian Airlines crash, the stock has plunged 12% and cutting back production, lost 24$ from its market cap.</p>\n<p>&#x201C;What do I know about branding, maybe nothing (but I did become President!),&#x201D; Trump simpered in a morning tweet.</p>\n\n<p>&#x201C;[B]ut if I were Boeing, I would FIX the Boeing 737 MAX, add some additional great features, &amp; REBRAND the plane with a new name,&#x201D; he advised.</p>\n<p>&#x201C;No product has suffered like this one. But again, what the hell do I know?&#x201D; Trump said which, given his history with defunct Trump Airlines and other projects, triggered shooting-fish-in-barrel responses on Twitter, predictably taking &#x201C;Trump Airlines&#x201D; to Top-10 trending status worldwide. Among the responses:</p>\n<div class=\"embed-twitter\">\n<blockquote class=\"twitter-tweet\">\n<p>Trump Airlines. Now there&apos;s a brand. <a href=\"https://twitter.com/realDonaldTrump?ref_src=twsrc%5Etfw\">@realDonaldTrump</a> <a href=\"https://t.co/xdcgcjohW4\">https://t.co/xdcgcjohW4</a></p>\n<p>&#x2014; Brian J. 
Karem (@BrianKarem) <a href=\"https://twitter.com/BrianKarem/status/1117767646961963009?ref_src=twsrc%5Etfw\">April 15, 2019</a></p></blockquote>\n</div>\n<div class=\"embed-twitter\">\n<blockquote class=\"twitter-tweet\">\n<p>Before Trump tries to act as though he knows how to fix Boeing, let&#x2019;s all review what happened to Trump Airlines, also known as &#x2018;Trump Shuttle&#x2019;.<a href=\"https://twitter.com/HillReporter?ref_src=twsrc%5Etfw\">@hillreporter</a><a href=\"https://t.co/EFacdLsB9m\">https://t.co/EFacdLsB9m</a></p>\n<p>&#x2014; Ed Krassenstein (@EdKrassen) <a href=\"https://twitter.com/EdKrassen/status/1117773751217795078?ref_src=twsrc%5Etfw\">April 15, 2019</a></p></blockquote>\n</div>\n<p>Trump&#x2019;s tweet:</p>\n<div class=\"embed-twitter\">\n<blockquote class=\"twitter-tweet\">\n<p>What do I know about branding, maybe nothing (but I did become President!), but if I were Boeing, I would FIX the Boeing 737 MAX, add some additional great features, &amp; REBRAND the plane with a new name.<br>No product has suffered like this one. But again, what the hell do I know?</p>\n<p>&#x2014; Donald J. 
Trump (@realDonaldTrump) <a href=\"https://twitter.com/realDonaldTrump/status/1117736685721223168?ref_src=twsrc%5Etfw\">April 15, 2019</a></p></blockquote>\n</div>\n\t\t\t\t</div>\n\n\t\t\t\t<p class=\"pmc-u-text-align-center\">Subscribe to <a href=\"https://pages.email.deadline.com/signup\" class=\"pmc-u-font-weight-bold\">Deadline Breaking News Alerts</a> and keep your inbox happy.</p><div class=\"article-tags u-flex u-align-items-center u-flex-direction-column@mobile-max u-justify-content-center pmc-u-margin-b-1\">\n\t<span class=\"c-label  pmc-u-font-size-16 pmc-u-margin-r-025\">\n\n\tRead More About:\n</span>\n\t<nav class=\"o-nav o-nav--horizontal \">\n\n\t\n\t<ul class=\"o-nav__list u-justify-content-center@mobile-max pmc-u-margin-a-00 u-flex-wrap-wrap\">\n\t\t\t\t\t<li class=\"o-nav__list-item u-text-transform-uppercase pmc-u-font-size-12 a-icon-before a-icon-forward-slash pmc-u-margin-l-050 pmc-u-margin-b-050@mobile-max\">\n\t\t\t\t<a class=\"c-nav-link  \" href=\"https://deadline.com/tag/boeing/\">\n\tBoeing</a>\n\t\t\t</li>\n\t\t\t\t\t<li class=\"o-nav__list-item u-text-transform-uppercase pmc-u-font-size-12 a-icon-before a-icon-forward-slash pmc-u-margin-l-050 pmc-u-margin-b-050@mobile-max\">\n\t\t\t\t<a class=\"c-nav-link  \" href=\"https://deadline.com/tag/donald-trump/\">\n\tDonald Trump</a>\n\t\t\t</li>\n\t\t\t\t\t<li class=\"o-nav__list-item u-text-transform-uppercase pmc-u-font-size-12 a-icon-before a-icon-forward-slash pmc-u-margin-l-050 pmc-u-margin-b-050@mobile-max\">\n\t\t\t\t<a class=\"c-nav-link  \" href=\"https://deadline.com/tag/twitter/\">\n\tTwitter</a>\n\t\t\t</li>\n\t\t\t</ul>\n</nav>\n</div>\n\n<div class=\"\">\n\t<div class=\"widget widget_pmc_outbrain_widget\"><div class=\"outbrain-widget\">\n\t\t\t<div class=\"OUTBRAIN\"></div>\n\t</div>\n</div></div>\n\n\t\t\t\t<div class=\"pmc-u-margin-tb-2\">\n\t\t\t\t\t\t<div class=\"admz\" id=\"adm-after-article\">\n\t\t\t\t\t<div class=\"adma google-publisher\">\n\t\t\t\t\n<div 
class=\"pmc-adm-goog-pub-div ad-text\">\n\t<div id=\"div-gpt-dl-ros-620x250-uid4\" class=\"ad-rotatable adw-620 adh-250\"></div>\n\t</div>\n\t\t\t</div>\n\t\t\t\t</div>\n\t\t\t\t</div>\n\n\t\t\t\t\n\n<div id=\"comments-loaded\"></div>\n\t\t\t\t<div class=\"pmc-u-margin-tb-2\">\n\t\t\t\t\t\t<div class=\"admz\" id=\"adm-after-comments\">\n\t\t\t\t\t<div class=\"adma google-publisher\">\n\t\t\t\t\n<div class=\"pmc-adm-goog-pub-div \">\n\t<div id=\"div-gpt-dl-ros-620x251-uid5\" class=\"ad-rotatable adw-620 adh-251\"></div>\n\t</div>\n\t\t\t</div>\n\t\t\t\t</div>\n\t\t\t\t</div>\n\t\t\t</div></div>",
  "author": "Lisa de Moraes",
  "date_published": "2019-04-15T06:18:00.000Z",
  "lead_image_url": "https://pmcdeadline2.files.wordpress.com/2019/01/donald-trump-2.jpg?w=1024",
  "dek": null,
  "next_page_url": null,
  "url": "https://deadline.com/2019/04/donald-trump-boeing-max-737-rebrand-advice-twitter-1202595880/",
  "domain": "deadline.com",
  "word_count": 257,
  "direction": "ltr",
  "total_pages": 1,
  "rendered_pages": 1
}

null fields

  • dek

  • next_page_url

1 failed test 😱

DeadlineComExtractor initial test case returns the date_published

See what went wrong
AssertionError [ERR_ASSERTION]: '2019-04-15T06:18:00.000Z' == '2019-04-14T21:18:00.000Z'
    at Object.equal (/home/circleci/project/src/extractors/custom/deadline.com/index.test.js:65:14)
    at tryCatch (/home/circleci/project/node_modules/regenerator-runtime/runtime.js:62:40)
    at Generator.invoke [as _invoke] (/home/circleci/project/node_modules/regenerator-runtime/runtime.js:288:22)
    at Generator.prototype.(anonymous function) [as next] (/home/circleci/project/node_modules/regenerator-runtime/runtime.js:114:21)
    at asyncGeneratorStep (/home/circleci/project/src/extractors/custom/deadline.com/index.test.js:17:103)
    at _next (/home/circleci/project/src/extractors/custom/deadline.com/index.test.js:19:194)
    at <anonymous>
    at process._tickCallback (internal/process/next_tick.js:188:7)


@postlight-org
Collaborator

🤖 Automated Parsing Preview 🤖

Commit: fix: date_published selectors

Screenshot of fixture (this embed should work after repo is public)

Original Article | HTML Fixture | Parsed Content Preview

Parsed JSON
{
  "title": "Donald Trump Advises Boeing, Tweeting “But What The Hell Do I Know?”; Twitter Answers – Deadline",
  "content": "<div><div class=\"pmc-a-grid-item pmc-a-span2@desktop u-max-width-100p\">\n\t\t\t\t<figure class=\"c-figure u-border-b-1 u-border-color-grey-medium-light\">\n\t<img width=\"450\" height=\"253\" src=\"https://pmcdeadline2.files.wordpress.com/2019/01/donald-trump-2.jpg?w=450&amp;h=253&amp;crop=1\" class=\"c-figure__image\" alt=\"Donald Trump\" srcset=\"https://pmcdeadline2.files.wordpress.com/2019/01/donald-trump-2.jpg?w=1000&amp;h=563&amp;crop=1 1000w, https://pmcdeadline2.files.wordpress.com/2019/01/donald-trump-2.jpg?w=910&amp;h=511&amp;crop=1 910w, https://pmcdeadline2.files.wordpress.com/2019/01/donald-trump-2.jpg?w=681&amp;h=383&amp;crop=1 681w, https://pmcdeadline2.files.wordpress.com/2019/01/donald-trump-2.jpg?w=450&amp;h=253&amp;crop=1 450w, https://pmcdeadline2.files.wordpress.com/2019/01/donald-trump-2.jpg?w=250&amp;h=140&amp;crop=1 250w, https://pmcdeadline2.files.wordpress.com/2019/01/donald-trump-2.jpg?w=225&amp;h=225&amp;crop=1 225w\" sizes=\"(min-width: 87.5rem) 1000px, (min-width: 78.75rem) 681px, (min-width: 48rem) 450px, (max-width: 48rem) 250px\">\n\t\t<figcaption class=\"c-figure__caption u-flex u-flex-direction-column pmc-u-font-family-helvetica pmc-u-padding-tb-025\">\n\n\t\t\t\n\t\t\t\t\t\t\t<span class=\"pmc-u-color-grey-medium-dark pmc-u-font-weight-light pmc-u-font-size-12\">\n\t\t\t\t\tAndrew Harnik/AP/Shutterstock\t\t\t\t</span>\n\t\t\t\n\t\t</figcaption>\n\n\t</figure>\n\n\t\t\t\t<div class=\"a-content pmc-u-line-height-copy pmc-u-font-family-georgia pmc-u-font-size-16 pmc-u-font-size-18@desktop\">\n\t\t\t\t\t<p><a href=\"https://deadline.com/tag/twitter/\" id=\"auto-tag_twitter\">Twitter</a> erupted Monday morning when President <a href=\"https://deadline.com/tag/donald-trump/\" id=\"auto-tag_donald-trump\">Donald Trump</a> shared his branding expertise with <a href=\"https://deadline.com/tag/boeing/\" id=\"auto-tag_boeing\">Boeing</a>, after American Airlines announced it was cancelling Boeing&#x2019;s Max 737 flights 
through mid-August. That after Southwest Airlines, the largest operator of Boeing jets, canceled its Max flights through 5 August.</p>\n\n<p>The U.S. was one of the last countries to ground the plane last month after a second deadly crash in Ethiopia, following by months a crash in Indonesia. Since the second Ethiopian Airlines crash, the stock has plunged 12% and cutting back production, lost 24$ from its market cap.</p>\n<p>&#x201C;What do I know about branding, maybe nothing (but I did become President!),&#x201D; Trump simpered in a morning tweet.</p>\n\n<p>&#x201C;[B]ut if I were Boeing, I would FIX the Boeing 737 MAX, add some additional great features, &amp; REBRAND the plane with a new name,&#x201D; he advised.</p>\n<p>&#x201C;No product has suffered like this one. But again, what the hell do I know?&#x201D; Trump said which, given his history with defunct Trump Airlines and other projects, triggered shooting-fish-in-barrel responses on Twitter, predictably taking &#x201C;Trump Airlines&#x201D; to Top-10 trending status worldwide. Among the responses:</p>\n<div class=\"embed-twitter\">\n<blockquote class=\"twitter-tweet\">\n<p>Trump Airlines. Now there&apos;s a brand. <a href=\"https://twitter.com/realDonaldTrump?ref_src=twsrc%5Etfw\">@realDonaldTrump</a> <a href=\"https://t.co/xdcgcjohW4\">https://t.co/xdcgcjohW4</a></p>\n<p>&#x2014; Brian J. 
Karem (@BrianKarem) <a href=\"https://twitter.com/BrianKarem/status/1117767646961963009?ref_src=twsrc%5Etfw\">April 15, 2019</a></p></blockquote>\n</div>\n<div class=\"embed-twitter\">\n<blockquote class=\"twitter-tweet\">\n<p>Before Trump tries to act as though he knows how to fix Boeing, let&#x2019;s all review what happened to Trump Airlines, also known as &#x2018;Trump Shuttle&#x2019;.<a href=\"https://twitter.com/HillReporter?ref_src=twsrc%5Etfw\">@hillreporter</a><a href=\"https://t.co/EFacdLsB9m\">https://t.co/EFacdLsB9m</a></p>\n<p>&#x2014; Ed Krassenstein (@EdKrassen) <a href=\"https://twitter.com/EdKrassen/status/1117773751217795078?ref_src=twsrc%5Etfw\">April 15, 2019</a></p></blockquote>\n</div>\n<p>Trump&#x2019;s tweet:</p>\n<div class=\"embed-twitter\">\n<blockquote class=\"twitter-tweet\">\n<p>What do I know about branding, maybe nothing (but I did become President!), but if I were Boeing, I would FIX the Boeing 737 MAX, add some additional great features, &amp; REBRAND the plane with a new name.<br>No product has suffered like this one. But again, what the hell do I know?</p>\n<p>&#x2014; Donald J. 
Trump (@realDonaldTrump) <a href=\"https://twitter.com/realDonaldTrump/status/1117736685721223168?ref_src=twsrc%5Etfw\">April 15, 2019</a></p></blockquote>\n</div>\n\t\t\t\t</div>\n\n\t\t\t\t<p class=\"pmc-u-text-align-center\">Subscribe to <a href=\"https://pages.email.deadline.com/signup\" class=\"pmc-u-font-weight-bold\">Deadline Breaking News Alerts</a> and keep your inbox happy.</p><div class=\"article-tags u-flex u-align-items-center u-flex-direction-column@mobile-max u-justify-content-center pmc-u-margin-b-1\">\n\t<span class=\"c-label  pmc-u-font-size-16 pmc-u-margin-r-025\">\n\n\tRead More About:\n</span>\n\t<nav class=\"o-nav o-nav--horizontal \">\n\n\t\n\t<ul class=\"o-nav__list u-justify-content-center@mobile-max pmc-u-margin-a-00 u-flex-wrap-wrap\">\n\t\t\t\t\t<li class=\"o-nav__list-item u-text-transform-uppercase pmc-u-font-size-12 a-icon-before a-icon-forward-slash pmc-u-margin-l-050 pmc-u-margin-b-050@mobile-max\">\n\t\t\t\t<a class=\"c-nav-link  \" href=\"https://deadline.com/tag/boeing/\">\n\tBoeing</a>\n\t\t\t</li>\n\t\t\t\t\t<li class=\"o-nav__list-item u-text-transform-uppercase pmc-u-font-size-12 a-icon-before a-icon-forward-slash pmc-u-margin-l-050 pmc-u-margin-b-050@mobile-max\">\n\t\t\t\t<a class=\"c-nav-link  \" href=\"https://deadline.com/tag/donald-trump/\">\n\tDonald Trump</a>\n\t\t\t</li>\n\t\t\t\t\t<li class=\"o-nav__list-item u-text-transform-uppercase pmc-u-font-size-12 a-icon-before a-icon-forward-slash pmc-u-margin-l-050 pmc-u-margin-b-050@mobile-max\">\n\t\t\t\t<a class=\"c-nav-link  \" href=\"https://deadline.com/tag/twitter/\">\n\tTwitter</a>\n\t\t\t</li>\n\t\t\t</ul>\n</nav>\n</div>\n\n<div class=\"\">\n\t<div class=\"widget widget_pmc_outbrain_widget\"><div class=\"outbrain-widget\">\n\t\t\t<div class=\"OUTBRAIN\"></div>\n\t</div>\n</div></div>\n\n\t\t\t\t<div class=\"pmc-u-margin-tb-2\">\n\t\t\t\t\t\t<div class=\"admz\" id=\"adm-after-article\">\n\t\t\t\t\t<div class=\"adma google-publisher\">\n\t\t\t\t\n<div 
class=\"pmc-adm-goog-pub-div ad-text\">\n\t<div id=\"div-gpt-dl-ros-620x250-uid4\" class=\"ad-rotatable adw-620 adh-250\"></div>\n\t</div>\n\t\t\t</div>\n\t\t\t\t</div>\n\t\t\t\t</div>\n\n\t\t\t\t\n\n<div id=\"comments-loaded\"></div>\n\t\t\t\t<div class=\"pmc-u-margin-tb-2\">\n\t\t\t\t\t\t<div class=\"admz\" id=\"adm-after-comments\">\n\t\t\t\t\t<div class=\"adma google-publisher\">\n\t\t\t\t\n<div class=\"pmc-adm-goog-pub-div \">\n\t<div id=\"div-gpt-dl-ros-620x251-uid5\" class=\"ad-rotatable adw-620 adh-251\"></div>\n\t</div>\n\t\t\t</div>\n\t\t\t\t</div>\n\t\t\t\t</div>\n\t\t\t</div></div>",
  "author": "Lisa de Moraes",
  "date_published": "2019-04-15T13:18:34.000Z",
  "lead_image_url": "https://pmcdeadline2.files.wordpress.com/2019/01/donald-trump-2.jpg?w=1024",
  "dek": null,
  "next_page_url": null,
  "url": "https://deadline.com/2019/04/donald-trump-boeing-max-737-rebrand-advice-twitter-1202595880/",
  "domain": "deadline.com",
  "word_count": 257,
  "direction": "ltr",
  "total_pages": 1,
  "rendered_pages": 1
}

null fields

  • dek

  • next_page_url

✅ All tests passed
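
The passing value above, 13:18:34Z, includes seconds, which suggests the date is now read from a machine-readable timestamp rather than from display text. The following is a sketch only. The selector, the attribute name, and the raw timestamp are assumptions for illustration and are not taken from the merged extractor.

```javascript
// Hypothetical fragment in the shape of a custom extractor's date field.
// The selector and attribute name are illustrative, not the PR's actual values.
const dateFragment = {
  date_published: {
    selectors: [['meta[name="article:published_time"]', 'value']],
  },
};

// Assumed raw value: a timestamp that carries its own UTC offset.
const raw = '2019-04-15T06:18:34-07:00';

// Because the offset is part of the string, the parsed instant does not
// depend on the time zone of the machine running the test.
const iso = new Date(raw).toISOString(); // '2019-04-15T13:18:34.000Z'

console.log(dateFragment.date_published.selectors[0][0], '->', iso);
```

A timestamp with an explicit offset parses to the same instant on a contributor's machine and on CI, so an assertion on it should not need a per-machine time zone adjustment.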

src/extractors/custom/deadline.com/index.js (outdated, resolved)
src/extractors/custom/deadline.com/index.js (outdated, resolved)
@postlight-org
Collaborator

🤖 Automated Parsing Preview 🤖

Commit: fix: title and author selector

Screenshot of fixture (this embed should work after repo is public)

Original Article | HTML Fixture | Parsed Content Preview

Parsed JSON
{
  "title": "Donald Trump Advises Boeing To Rebrand Max 737, Tweeting “But What The Hell Do I Know?”; Twitter Answers",
  "content": "<div><div class=\"pmc-a-grid-item pmc-a-span2@desktop u-max-width-100p\">\n\t\t\t\t<figure class=\"c-figure u-border-b-1 u-border-color-grey-medium-light\">\n\t<img width=\"450\" height=\"253\" src=\"https://pmcdeadline2.files.wordpress.com/2019/01/donald-trump-2.jpg?w=450&amp;h=253&amp;crop=1\" class=\"c-figure__image\" alt=\"Donald Trump\" srcset=\"https://pmcdeadline2.files.wordpress.com/2019/01/donald-trump-2.jpg?w=1000&amp;h=563&amp;crop=1 1000w, https://pmcdeadline2.files.wordpress.com/2019/01/donald-trump-2.jpg?w=910&amp;h=511&amp;crop=1 910w, https://pmcdeadline2.files.wordpress.com/2019/01/donald-trump-2.jpg?w=681&amp;h=383&amp;crop=1 681w, https://pmcdeadline2.files.wordpress.com/2019/01/donald-trump-2.jpg?w=450&amp;h=253&amp;crop=1 450w, https://pmcdeadline2.files.wordpress.com/2019/01/donald-trump-2.jpg?w=250&amp;h=140&amp;crop=1 250w, https://pmcdeadline2.files.wordpress.com/2019/01/donald-trump-2.jpg?w=225&amp;h=225&amp;crop=1 225w\" sizes=\"(min-width: 87.5rem) 1000px, (min-width: 78.75rem) 681px, (min-width: 48rem) 450px, (max-width: 48rem) 250px\">\n\t\t<figcaption class=\"c-figure__caption u-flex u-flex-direction-column pmc-u-font-family-helvetica pmc-u-padding-tb-025\">\n\n\t\t\t\n\t\t\t\t\t\t\t<span class=\"pmc-u-color-grey-medium-dark pmc-u-font-weight-light pmc-u-font-size-12\">\n\t\t\t\t\tAndrew Harnik/AP/Shutterstock\t\t\t\t</span>\n\t\t\t\n\t\t</figcaption>\n\n\t</figure>\n\n\t\t\t\t<div class=\"a-content pmc-u-line-height-copy pmc-u-font-family-georgia pmc-u-font-size-16 pmc-u-font-size-18@desktop\">\n\t\t\t\t\t<p><a href=\"https://deadline.com/tag/twitter/\" id=\"auto-tag_twitter\">Twitter</a> erupted Monday morning when President <a href=\"https://deadline.com/tag/donald-trump/\" id=\"auto-tag_donald-trump\">Donald Trump</a> shared his branding expertise with <a href=\"https://deadline.com/tag/boeing/\" id=\"auto-tag_boeing\">Boeing</a>, after American Airlines announced it was cancelling Boeing&#x2019;s Max 737 flights 
through mid-August. That after Southwest Airlines, the largest operator of Boeing jets, canceled its Max flights through 5 August.</p>\n\n<p>The U.S. was one of the last countries to ground the plane last month after a second deadly crash in Ethiopia, following by months a crash in Indonesia. Since the second Ethiopian Airlines crash, the stock has plunged 12% and cutting back production, lost 24$ from its market cap.</p>\n<p>&#x201C;What do I know about branding, maybe nothing (but I did become President!),&#x201D; Trump simpered in a morning tweet.</p>\n\n<p>&#x201C;[B]ut if I were Boeing, I would FIX the Boeing 737 MAX, add some additional great features, &amp; REBRAND the plane with a new name,&#x201D; he advised.</p>\n<p>&#x201C;No product has suffered like this one. But again, what the hell do I know?&#x201D; Trump said which, given his history with defunct Trump Airlines and other projects, triggered shooting-fish-in-barrel responses on Twitter, predictably taking &#x201C;Trump Airlines&#x201D; to Top-10 trending status worldwide. Among the responses:</p>\n<div class=\"embed-twitter\">\n<blockquote class=\"twitter-tweet\">\n<p>Trump Airlines. Now there&apos;s a brand. <a href=\"https://twitter.com/realDonaldTrump?ref_src=twsrc%5Etfw\">@realDonaldTrump</a> <a href=\"https://t.co/xdcgcjohW4\">https://t.co/xdcgcjohW4</a></p>\n<p>&#x2014; Brian J. 
Karem (@BrianKarem) <a href=\"https://twitter.com/BrianKarem/status/1117767646961963009?ref_src=twsrc%5Etfw\">April 15, 2019</a></p></blockquote>\n</div>\n<div class=\"embed-twitter\">\n<blockquote class=\"twitter-tweet\">\n<p>Before Trump tries to act as though he knows how to fix Boeing, let&#x2019;s all review what happened to Trump Airlines, also known as &#x2018;Trump Shuttle&#x2019;.<a href=\"https://twitter.com/HillReporter?ref_src=twsrc%5Etfw\">@hillreporter</a><a href=\"https://t.co/EFacdLsB9m\">https://t.co/EFacdLsB9m</a></p>\n<p>&#x2014; Ed Krassenstein (@EdKrassen) <a href=\"https://twitter.com/EdKrassen/status/1117773751217795078?ref_src=twsrc%5Etfw\">April 15, 2019</a></p></blockquote>\n</div>\n<p>Trump&#x2019;s tweet:</p>\n<div class=\"embed-twitter\">\n<blockquote class=\"twitter-tweet\">\n<p>What do I know about branding, maybe nothing (but I did become President!), but if I were Boeing, I would FIX the Boeing 737 MAX, add some additional great features, &amp; REBRAND the plane with a new name.<br>No product has suffered like this one. But again, what the hell do I know?</p>\n<p>&#x2014; Donald J. 
Trump (@realDonaldTrump) <a href=\"https://twitter.com/realDonaldTrump/status/1117736685721223168?ref_src=twsrc%5Etfw\">April 15, 2019</a></p></blockquote>\n</div>\n\t\t\t\t</div>\n\n\t\t\t\t<p class=\"pmc-u-text-align-center\">Subscribe to <a href=\"https://pages.email.deadline.com/signup\" class=\"pmc-u-font-weight-bold\">Deadline Breaking News Alerts</a> and keep your inbox happy.</p><div class=\"article-tags u-flex u-align-items-center u-flex-direction-column@mobile-max u-justify-content-center pmc-u-margin-b-1\">\n\t<span class=\"c-label  pmc-u-font-size-16 pmc-u-margin-r-025\">\n\n\tRead More About:\n</span>\n\t<nav class=\"o-nav o-nav--horizontal \">\n\n\t\n\t<ul class=\"o-nav__list u-justify-content-center@mobile-max pmc-u-margin-a-00 u-flex-wrap-wrap\">\n\t\t\t\t\t<li class=\"o-nav__list-item u-text-transform-uppercase pmc-u-font-size-12 a-icon-before a-icon-forward-slash pmc-u-margin-l-050 pmc-u-margin-b-050@mobile-max\">\n\t\t\t\t<a class=\"c-nav-link  \" href=\"https://deadline.com/tag/boeing/\">\n\tBoeing</a>\n\t\t\t</li>\n\t\t\t\t\t<li class=\"o-nav__list-item u-text-transform-uppercase pmc-u-font-size-12 a-icon-before a-icon-forward-slash pmc-u-margin-l-050 pmc-u-margin-b-050@mobile-max\">\n\t\t\t\t<a class=\"c-nav-link  \" href=\"https://deadline.com/tag/donald-trump/\">\n\tDonald Trump</a>\n\t\t\t</li>\n\t\t\t\t\t<li class=\"o-nav__list-item u-text-transform-uppercase pmc-u-font-size-12 a-icon-before a-icon-forward-slash pmc-u-margin-l-050 pmc-u-margin-b-050@mobile-max\">\n\t\t\t\t<a class=\"c-nav-link  \" href=\"https://deadline.com/tag/twitter/\">\n\tTwitter</a>\n\t\t\t</li>\n\t\t\t</ul>\n</nav>\n</div>\n\n<div class=\"\">\n\t<div class=\"widget widget_pmc_outbrain_widget\"><div class=\"outbrain-widget\">\n\t\t\t<div class=\"OUTBRAIN\"></div>\n\t</div>\n</div></div>\n\n\t\t\t\t<div class=\"pmc-u-margin-tb-2\">\n\t\t\t\t\t\t<div class=\"admz\" id=\"adm-after-article\">\n\t\t\t\t\t<div class=\"adma google-publisher\">\n\t\t\t\t\n<div 
class=\"pmc-adm-goog-pub-div ad-text\">\n\t<div id=\"div-gpt-dl-ros-620x250-uid4\" class=\"ad-rotatable adw-620 adh-250\"></div>\n\t</div>\n\t\t\t</div>\n\t\t\t\t</div>\n\t\t\t\t</div>\n\n\t\t\t\t\n\n<div id=\"comments-loaded\"></div>\n\t\t\t\t<div class=\"pmc-u-margin-tb-2\">\n\t\t\t\t\t\t<div class=\"admz\" id=\"adm-after-comments\">\n\t\t\t\t\t<div class=\"adma google-publisher\">\n\t\t\t\t\n<div class=\"pmc-adm-goog-pub-div \">\n\t<div id=\"div-gpt-dl-ros-620x251-uid5\" class=\"ad-rotatable adw-620 adh-251\"></div>\n\t</div>\n\t\t\t</div>\n\t\t\t\t</div>\n\t\t\t\t</div>\n\t\t\t</div></div>",
  "author": "Lisa de Moraes",
  "date_published": "2019-04-15T13:18:34.000Z",
  "lead_image_url": "https://pmcdeadline2.files.wordpress.com/2019/01/donald-trump-2.jpg?w=1024",
  "dek": null,
  "next_page_url": null,
  "url": "https://deadline.com/2019/04/donald-trump-boeing-max-737-rebrand-advice-twitter-1202595880/",
  "domain": "deadline.com",
  "word_count": 257,
  "direction": "ltr",
  "total_pages": 1,
  "rendered_pages": 1
}

null fields

  • dek

  • next_page_url

✅ All tests passed
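
The "null fields" summary above can be reproduced from the parsed JSON. How the preview bot actually builds that list is not shown in this thread; the following is a minimal sketch, assuming it simply collects the top-level keys whose value is `null`. The helper name `nullFields` is hypothetical, and the object is an abbreviated copy of the result above with `title`, `content`, and a few other fields left out for brevity.

```javascript
// Hypothetical helper (not taken from the bot's source):
// return the names of top-level fields whose value is exactly null, in key order.
function nullFields(parsed) {
  return Object.keys(parsed).filter(key => parsed[key] === null);
}

// Abbreviated copy of the parsed result from the preview above.
const parsed = {
  author: 'Lisa de Moraes',
  date_published: '2019-04-15T13:18:34.000Z',
  dek: null,
  next_page_url: null,
  domain: 'deadline.com',
  word_count: 257,
  direction: 'ltr',
  total_pages: 1,
  rendered_pages: 1,
};

console.log(nullFields(parsed)); // [ 'dek', 'next_page_url' ]
```

A strict `=== null` check is used so that empty strings or a count of `0` would not be reported as missing.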

@kik0220 kik0220 force-pushed the feat-deadline-com-extractor branch from 4c89923 to b7fb367 Compare April 21, 2019 08:06
@postlight-org
Collaborator

🤖 Automated Parsing Preview 🤖

Commit: test: transform .embed-twitter

Screenshot of fixture (this embed should work after repo is public)

Original Article | HTML Fixture | Parsed Content Preview

Parsed JSON
{
  "title": "Donald Trump Advises Boeing To Rebrand Max 737, Tweeting “But What The Hell Do I Know?”; Twitter Answers",
  "content": "<div><div class=\"pmc-a-grid-item pmc-a-span2@desktop u-max-width-100p\">\n\t\t\t\t<figure class=\"c-figure u-border-b-1 u-border-color-grey-medium-light\">\n\t<img width=\"450\" src=\"https://pmcdeadline2.files.wordpress.com/2019/01/donald-trump-2.jpg?w=450&amp;h=253&amp;crop=1\" class=\"c-figure__image\" alt=\"Donald Trump\" srcset=\"https://pmcdeadline2.files.wordpress.com/2019/01/donald-trump-2.jpg?w=1000&amp;h=563&amp;crop=1 1000w, https://pmcdeadline2.files.wordpress.com/2019/01/donald-trump-2.jpg?w=910&amp;h=511&amp;crop=1 910w, https://pmcdeadline2.files.wordpress.com/2019/01/donald-trump-2.jpg?w=681&amp;h=383&amp;crop=1 681w, https://pmcdeadline2.files.wordpress.com/2019/01/donald-trump-2.jpg?w=450&amp;h=253&amp;crop=1 450w, https://pmcdeadline2.files.wordpress.com/2019/01/donald-trump-2.jpg?w=250&amp;h=140&amp;crop=1 250w, https://pmcdeadline2.files.wordpress.com/2019/01/donald-trump-2.jpg?w=225&amp;h=225&amp;crop=1 225w\" sizes=\"(min-width: 87.5rem) 1000px, (min-width: 78.75rem) 681px, (min-width: 48rem) 450px, (max-width: 48rem) 250px\">\n\t\t<figcaption class=\"c-figure__caption u-flex u-flex-direction-column pmc-u-font-family-helvetica pmc-u-padding-tb-025\">\n\n\t\t\t\n\t\t\t\t\t\t\t<span class=\"pmc-u-color-grey-medium-dark pmc-u-font-weight-light pmc-u-font-size-12\">\n\t\t\t\t\tAndrew Harnik/AP/Shutterstock\t\t\t\t</span>\n\t\t\t\n\t\t</figcaption>\n\n\t</figure>\n\n\t\t\t\t<div class=\"a-content pmc-u-line-height-copy pmc-u-font-family-georgia pmc-u-font-size-16 pmc-u-font-size-18@desktop\">\n\t\t\t\t\t<p><a href=\"https://deadline.com/tag/twitter/\" id=\"auto-tag_twitter\">Twitter</a> erupted Monday morning when President <a href=\"https://deadline.com/tag/donald-trump/\" id=\"auto-tag_donald-trump\">Donald Trump</a> shared his branding expertise with <a href=\"https://deadline.com/tag/boeing/\" id=\"auto-tag_boeing\">Boeing</a>, after American Airlines announced it was cancelling Boeing&#x2019;s Max 737 flights through 
mid-August. That after Southwest Airlines, the largest operator of Boeing jets, canceled its Max flights through 5 August.</p>\n\n<p>The U.S. was one of the last countries to ground the plane last month after a second deadly crash in Ethiopia, following by months a crash in Indonesia. Since the second Ethiopian Airlines crash, the stock has plunged 12% and cutting back production, lost 24$ from its market cap.</p>\n<p>&#x201C;What do I know about branding, maybe nothing (but I did become President!),&#x201D; Trump simpered in a morning tweet.</p>\n\n<p>&#x201C;[B]ut if I were Boeing, I would FIX the Boeing 737 MAX, add some additional great features, &amp; REBRAND the plane with a new name,&#x201D; he advised.</p>\n<p>&#x201C;No product has suffered like this one. But again, what the hell do I know?&#x201D; Trump said which, given his history with defunct Trump Airlines and other projects, triggered shooting-fish-in-barrel responses on Twitter, predictably taking &#x201C;Trump Airlines&#x201D; to Top-10 trending status worldwide. Among the responses:</p>\n\n<blockquote class=\"twitter-tweet\">\n<p>Trump Airlines. Now there&apos;s a brand. <a href=\"https://twitter.com/realDonaldTrump?ref_src=twsrc%5Etfw\">@realDonaldTrump</a> <a href=\"https://t.co/xdcgcjohW4\">https://t.co/xdcgcjohW4</a></p>\n<p>&#x2014; Brian J. 
Karem (@BrianKarem) <a href=\"https://twitter.com/BrianKarem/status/1117767646961963009?ref_src=twsrc%5Etfw\">April 15, 2019</a></p></blockquote>\n\n\n<blockquote class=\"twitter-tweet\">\n<p>Before Trump tries to act as though he knows how to fix Boeing, let&#x2019;s all review what happened to Trump Airlines, also known as &#x2018;Trump Shuttle&#x2019;.<a href=\"https://twitter.com/HillReporter?ref_src=twsrc%5Etfw\">@hillreporter</a><a href=\"https://t.co/EFacdLsB9m\">https://t.co/EFacdLsB9m</a></p>\n<p>&#x2014; Ed Krassenstein (@EdKrassen) <a href=\"https://twitter.com/EdKrassen/status/1117773751217795078?ref_src=twsrc%5Etfw\">April 15, 2019</a></p></blockquote>\n\n<p>Trump&#x2019;s tweet:</p>\n\n<blockquote class=\"twitter-tweet\">\n<p>What do I know about branding, maybe nothing (but I did become President!), but if I were Boeing, I would FIX the Boeing 737 MAX, add some additional great features, &amp; REBRAND the plane with a new name.<br>No product has suffered like this one. But again, what the hell do I know?</p>\n<p>&#x2014; Donald J. Trump (@realDonaldTrump) <a href=\"https://twitter.com/realDonaldTrump/status/1117736685721223168?ref_src=twsrc%5Etfw\">April 15, 2019</a></p></blockquote>\n\n\t\t\t\t</div>\n\n\t\t\t\t<p class=\"pmc-u-text-align-center\">Subscribe to <a href=\"https://pages.email.deadline.com/signup\" class=\"pmc-u-font-weight-bold\">Deadline Breaking News Alerts</a> and keep your inbox happy.</p><div class=\"article-tags u-flex u-align-items-center u-flex-direction-column@mobile-max u-justify-content-center pmc-u-margin-b-1\">\n\t<span class=\"c-label  pmc-u-font-size-16 pmc-u-margin-r-025\">\n\n\tRead More About:\n</span>\n\t<nav class=\"o-nav o-nav--horizontal \">\n\n\t\n\t\n</nav>\n</div>\n\n\n\n\t\t\t\t\n\n\t\t\t\t\n\n\n\t\t\t\t\n\t\t\t</div></div>",
  "author": "Lisa de Moraes",
  "date_published": "2019-04-15T13:18:34.000Z",
  "lead_image_url": "https://pmcdeadline2.files.wordpress.com/2019/01/donald-trump-2.jpg?w=1024",
  "dek": null,
  "next_page_url": null,
  "url": "https://deadline.com/2019/04/donald-trump-boeing-max-737-rebrand-advice-twitter-1202595880/",
  "domain": "deadline.com",
  "word_count": 257,
  "direction": "ltr",
  "total_pages": 1,
  "rendered_pages": 1
}

null fields

  • dek

  • next_page_url

✅ All tests passed
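
Compared with the first preview, the tweets in this preview are no longer wrapped in `<div class="embed-twitter">`; each `blockquote.twitter-tweet` now sits directly in the article body. The commit's code is not shown in this thread, so the following only illustrates that effect on a string of HTML. The function name `unwrapTwitterEmbeds` is hypothetical, and the regular expression assumes the wrapper contains no nested `<div>`. The parser itself works on a parsed DOM, where the same result would be achieved by replacing the wrapper node with its inner HTML.

```javascript
// Hypothetical illustration, not the code from this PR:
// drop each <div class="embed-twitter"> wrapper and keep what is inside it.
// Assumes the wrapper holds no nested <div>, as in the preview above.
function unwrapTwitterEmbeds(html) {
  const wrapper = /<div class="embed-twitter">\s*([\s\S]*?)\s*<\/div>/g;
  return html.replace(wrapper, '$1');
}

// Shortened stand-in for one embedded tweet; the text is a placeholder.
const before =
  '<p>Among the responses:</p>\n' +
  '<div class="embed-twitter">\n' +
  '<blockquote class="twitter-tweet">\n<p>Example tweet</p></blockquote>\n' +
  '</div>';

console.log(unwrapTwitterEmbeds(before));
```

The blockquote and its `twitter-tweet` class are kept as-is, so a client that knows how to render tweet embeds can still recognise them.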

@kik0220 kik0220 force-pushed the feat-deadline-com-extractor branch from b7fb367 to f350cc8 Compare April 24, 2019 11:46
@postlight-org
Collaborator

🤖 Automated Parsing Preview 🤖

Commit: fix: regenerate the fixture and fix content selector

Screenshot of fixture (this embed should work after repo is public)

Original Article | HTML Fixture | Parsed Content Preview

Parsed JSON
{
  "title": "Donald Trump Advises Boeing To Rebrand Max 737, Tweeting “But What The Hell Do I Know?”; Twitter Answers",
  "content": "<div><article class=\"pmc-a-grid-item pmc-a-span2@desktop u-max-width-100p\">\n\t\t\t\t<figure class=\"c-figure u-border-b-1 u-border-color-grey-medium-light\">\n\t<img width=\"450\" src=\"https://pmcdeadline2.files.wordpress.com/2019/01/donald-trump-2.jpg?w=450&amp;h=253&amp;crop=1\" class=\"c-figure__image\" alt=\"Donald Trump\" srcset=\"https://pmcdeadline2.files.wordpress.com/2019/01/donald-trump-2.jpg?w=1000&amp;h=563&amp;crop=1 1000w, https://pmcdeadline2.files.wordpress.com/2019/01/donald-trump-2.jpg?w=910&amp;h=511&amp;crop=1 910w, https://pmcdeadline2.files.wordpress.com/2019/01/donald-trump-2.jpg?w=681&amp;h=383&amp;crop=1 681w, https://pmcdeadline2.files.wordpress.com/2019/01/donald-trump-2.jpg?w=450&amp;h=253&amp;crop=1 450w, https://pmcdeadline2.files.wordpress.com/2019/01/donald-trump-2.jpg?w=250&amp;h=140&amp;crop=1 250w, https://pmcdeadline2.files.wordpress.com/2019/01/donald-trump-2.jpg?w=225&amp;h=225&amp;crop=1 225w\" sizes=\"(min-width: 87.5rem) 1000px, (min-width: 78.75rem) 681px, (min-width: 48rem) 450px, (max-width: 48rem) 250px\">\n\t\t<figcaption class=\"c-figure__caption u-flex u-flex-direction-column pmc-u-font-family-helvetica pmc-u-padding-tb-025\">\n\n\t\t\t\n\t\t\t\t\t\t\t<span class=\"pmc-u-color-grey-medium-dark pmc-u-font-weight-light pmc-u-font-size-12\">\n\t\t\t\t\tAndrew Harnik/AP/Shutterstock\t\t\t\t</span>\n\t\t\t\n\t\t</figcaption>\n\n\t</figure>\n\n\t\t\t\t<div class=\"a-content pmc-u-line-height-copy pmc-u-font-family-georgia pmc-u-font-size-16 pmc-u-font-size-18@desktop\">\n\t\t\t\t\t<p><a href=\"https://deadline.com/tag/twitter/\" id=\"auto-tag_twitter\">Twitter</a> erupted Monday morning when President <a href=\"https://deadline.com/tag/donald-trump/\" id=\"auto-tag_donald-trump\">Donald Trump</a> shared his branding expertise with <a href=\"https://deadline.com/tag/boeing/\" id=\"auto-tag_boeing\">Boeing</a>, after American Airlines announced it was cancelling Boeing&#x2019;s Max 737 flights through 
mid-August. That after Southwest Airlines, the largest operator of Boeing jets, canceled its Max flights through 5 August.</p>\n<p>The U.S. was one of the last countries to ground the plane last month after a second deadly crash in Ethiopia, following by months a crash in Indonesia. Since the second Ethiopian Airlines crash, the stock has plunged 12% and cutting back production, lost 24$ from its market cap.</p>\n<p>&#x201C;What do I know about branding, maybe nothing (but I did become President!),&#x201D; Trump simpered in a morning tweet.</p>\n\n<p>&#x201C;[B]ut if I were Boeing, I would FIX the Boeing 737 MAX, add some additional great features, &amp; REBRAND the plane with a new name,&#x201D; he advised.</p>\n<p>&#x201C;No product has suffered like this one. But again, what the hell do I know?&#x201D; Trump said which, given his history with defunct Trump Airlines and other projects, triggered shooting-fish-in-barrel responses on Twitter, predictably taking &#x201C;Trump Airlines&#x201D; to Top-10 trending status worldwide. Among the responses:</p>\n\n<blockquote class=\"twitter-tweet\">\n<p>Trump Airlines. Now there&apos;s a brand. <a href=\"https://twitter.com/realDonaldTrump?ref_src=twsrc%5Etfw\">@realDonaldTrump</a> <a href=\"https://t.co/xdcgcjohW4\">https://t.co/xdcgcjohW4</a></p>\n<p>&#x2014; Brian J. 
Karem (@BrianKarem) <a href=\"https://twitter.com/BrianKarem/status/1117767646961963009?ref_src=twsrc%5Etfw\">April 15, 2019</a></p></blockquote>\n\n\n<blockquote class=\"twitter-tweet\">\n<p>Before Trump tries to act as though he knows how to fix Boeing, let&#x2019;s all review what happened to Trump Airlines, also known as &#x2018;Trump Shuttle&#x2019;.<a href=\"https://twitter.com/HillReporter?ref_src=twsrc%5Etfw\">@hillreporter</a><a href=\"https://t.co/EFacdLsB9m\">https://t.co/EFacdLsB9m</a></p>\n<p>&#x2014; Ed Krassenstein (@EdKrassen) <a href=\"https://twitter.com/EdKrassen/status/1117773751217795078?ref_src=twsrc%5Etfw\">April 15, 2019</a></p></blockquote>\n\n<p>Trump&#x2019;s tweet:</p>\n\n<blockquote class=\"twitter-tweet\">\n<p>What do I know about branding, maybe nothing (but I did become President!), but if I were Boeing, I would FIX the Boeing 737 MAX, add some additional great features, &amp; REBRAND the plane with a new name.<br>No product has suffered like this one. But again, what the hell do I know?</p>\n<p>&#x2014; Donald J. Trump (@realDonaldTrump) <a href=\"https://twitter.com/realDonaldTrump/status/1117736685721223168?ref_src=twsrc%5Etfw\">April 15, 2019</a></p></blockquote>\n\n\t\t\t\t</div>\n\n\t\t\t\t<p class=\"pmc-u-text-align-center\">Subscribe to <a href=\"https://pages.email.deadline.com/signup\" class=\"pmc-u-font-weight-bold\">Deadline Breaking News Alerts</a> and keep your inbox happy.</p><div class=\"article-tags u-flex u-align-items-center u-flex-direction-column@mobile-max u-justify-content-center pmc-u-margin-b-1\">\n\t<span class=\"c-label  pmc-u-font-size-16 pmc-u-margin-r-025\">\n\n\tRead More About:\n</span>\n\t<nav class=\"o-nav o-nav--horizontal \">\n\n\t\n\t\n</nav>\n</div>\n\n\n\n\t\t\t\t\n\n\t\t\t\t\n\n\n\t\t\t\t\n\t\t\t</article></div>",
  "author": "Lisa de Moraes",
  "date_published": "2019-04-15T13:18:34.000Z",
  "lead_image_url": "https://pmcdeadline2.files.wordpress.com/2019/01/donald-trump-2.jpg?w=1024",
  "dek": null,
  "next_page_url": null,
  "url": "https://deadline.com/2019/04/donald-trump-boeing-max-737-rebrand-advice-twitter-1202595880/",
  "domain": "deadline.com",
  "word_count": 259,
  "direction": "ltr",
  "total_pages": 1,
  "rendered_pages": 1
}

null fields

  • dek

  • next_page_url

✅ All tests passed
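
When reviewing successive previews like the two above, it can help to see which top-level fields changed. The sketch below is a hypothetical helper (`changedFields` is not part of the parser or the bot) applied to abbreviated copies of the previous and latest results. The `content` field is omitted for brevity; it also differs, since its root element is now `<article>` instead of `<div>`.

```javascript
// Hypothetical helper: list the top-level keys whose values differ between two results.
// Uses strict comparison, which is sufficient for the string/number/null fields used here.
function changedFields(before, after) {
  const keys = new Set([...Object.keys(before), ...Object.keys(after)]);
  return [...keys].filter(key => before[key] !== after[key]);
}

// Abbreviated copies of the two most recent previews ("title" and "content" omitted).
const previous = {
  author: 'Lisa de Moraes',
  date_published: '2019-04-15T13:18:34.000Z',
  dek: null,
  next_page_url: null,
  domain: 'deadline.com',
  word_count: 257,
  total_pages: 1,
};
const latest = { ...previous, word_count: 259 };

console.log(changedFields(previous, latest)); // [ 'word_count' ]
```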

@toufic-m toufic-m merged commit a38c727 into postlight:master Apr 24, 2019
@kik0220 kik0220 deleted the feat-deadline-com-extractor branch April 24, 2019 13:10
3 participants