Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add subscriber_count to the author object. #606

Merged
merged 7 commits into from
Apr 14, 2020
Merged

Add subscriber_count to the author object. #606

merged 7 commits into from
Apr 14, 2020

Conversation

moisout
Copy link
Contributor

@moisout moisout commented Apr 12, 2020

Add the subscriber count to the author object.

It gets converted into a number from an abbreviated form.
Support article on the abbreviation: https://support.google.com/youtube/thread/6543166?hl=en&msgid=13119244

@codecov
Copy link

codecov bot commented Apr 12, 2020

Codecov Report

Merging #606 into master will not change coverage by %.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff            @@
##            master      #606   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files            7         7           
  Lines          552       561    +9     
=========================================
+ Hits           552       561    +9     
Impacted Files Coverage Δ
lib/info-extras.js 100.00% <100.00%> (ø)
lib/util.js 100.00% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 7020f27...b27268a. Read the comment docs.

Copy link
Owner

@fent fent left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hi, thanks for your contrubution! I left a few comments

lib/util.js Outdated
@@ -348,9 +348,30 @@ exports.addFormatMeta = format => {
* @param {string} html
* @returns {string}
*/
exports.stripHTML = html => html
exports.stripHTML = html => decodeURIComponent(html)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm getting a URIError: URI malformed error from this

try calling it after this function is called

lib/util.js Outdated
Comment on lines 370 to 376
if (string.match(/[0-9,.]*[M]/)) {
return parseFloat(string.replace('M', '')) * 1000000;
} else if (string.match(/[0-9,.]*[K]/)) {
return parseFloat(string.replace('K', '')) * 1000;
} else {
return parseFloat(string);
}
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would look simpler if the number + the multiplier were extracted together. something like

let match = string.match(/([\d,.]+)([MK]?)/);
if (match) {
  let [num, multi] = match;
}

@@ -100,6 +104,7 @@ exports.getAuthor = body => {
user: username,
channel_url: `https://www.youtube.com/channel/${channelID}`,
user_url: `https://www.youtube.com/user/${username}`,
subscriber_count: isNaN(subscriberCount) ? null : subscriberCount,
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

move this check to inside the parseAbbreviatedNumber() function

Copy link
Contributor Author

@moisout moisout left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your feedback.

The abbreviated number now gets matched once, and number and multiplier are extracted from the second and third match group.

Also, if it doesn't match the regex, the function returns null. The isNaN check is therefore not necessary anymore.

I also moved the decodeURIComponent outside of the stripHTML function.

lib/util.js Outdated
Comment on lines 371 to 378
.match(/([\d,.]+)([MK]?)/);
if (match) {
const [, num, multi] = match;
return multi === 'M' ? +num * 1000000 :
multi === 'K' ? +num * 1000 : +num;
}
return null;
};
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The abbreviated number now gets matched once, and number and multiplier are extracted from the second and third match group.

Also, if it doesn't match the regex, the function returns null. The isNaN check is therefore not necessary anymore.

lib/util.js Outdated
Comment on lines 374 to 375
return multi === 'M' ? +num * 1000000 :
multi === 'K' ? +num * 1000 : +num;
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

parseFloat is preferred

@@ -18,7 +18,10 @@ const getMetaItem = (body, name) => util.between(body, `<meta itemprop="${name}"
exports.getVideoDescription = html => {
const regex = /<p.*?id="eow-description".*?>(.+?)<\/p>[\n\r\s]*?<\/div>/im;
const description = html.match(regex);
return description ? Entities.decode(util.stripHTML(description[1])) : '';
if (description) {
return Entities.decode(decodeURIComponent(util.stripHTML(description[1])));
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm, still getting the decode error with this video https://www.youtube.com/watch?v=PGqU_stcXDw

are you trying to fix URLs in descriptions? if so, then you can give the .replace calls inside stripHTML a function https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/replace#Specifying_a_function_as_a_parameter

also, it would be out of this scope of this PR

if (description) {
return Entities.decode(decodeURIComponent(util.stripHTML(description[1])));
}
return '';
return description ? Entities.decode(util.stripHTML(description[1])) : '';
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I reverted the changes on description, and will open a separate PR with a working version.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cool. you can create another branch on your fork. typically, people create branches for each of their features/fixes, that way you can work on many at the same time without having to wait for one of them to merge

Comment on lines +370 to +373
let [, num, multi] = match;
num = parseFloat(num);
return multi === 'M' ? num * 1000000 :
multi === 'K' ? num * 1000 : num;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replaced unary + with parseFloat.

Copy link
Owner

@fent fent left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thank you!

@fent fent merged commit 5566272 into fent:master Apr 14, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants