Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Proposal][Feature] Convert DMM's ContentID to standard ID #112

Merged
merged 6 commits into from
Sep 29, 2020

Conversation

seeyabye
Copy link
Collaborator

@seeyabye seeyabye commented Sep 26, 2020

The current ID is rather inconsistent with all other web providers. For instance, DMM uses a format of ID##### where # represents some kind of numeric value while others like Javlibrary uses the standard ID-###. Additionally, there's also no way to customize which type of ID I would like to use.

This exhibits a problem because I would prefer to have it use ID-### rather than ID####, but there is no way to reflect this choice in the settings file.

In this PR, I would like to propose to standardize the ID format to ID-### and keep ID##### as a separate ID, called ContentID. This would also allow customization in sort.format.file and others to support both <ID> and <CONTID>.

There are two ways to go about this problem.

  1. Include a new field in $Data to support ContentId and Id. Possibly, include another field that keeps the original scraped id, called ScrapeId
  2. Keep the original data structure, but make modifications to Convert-JVString to support <CONTID>.

Each of the above choices has its own repercussions:

  1. Requires refactoring all aggregators to include the additional field for support
  2. It's more tricky to maintain the long run because it's the responsibility of Convert-JVString to cover all use cases, especially when new aggregators are added. At the moment, I don't see a large deviation in ID patterns.

Right now, I think either choice is fine, but I will leave the final decision to you.

What do you think?

Personally, I have gone with the first choice, but if need be, I can implement the second choice.

With that, I've only made changes to DMM as a prototype, but if this proposal is accepted, I will continue to make changes for the others depending on the choice.

Checklist:

  • Implement Id and ContentId for DMM.
  • Implement ContentId for all other aggregators

@jvlflame jvlflame added the enhancement New feature or request label Sep 26, 2020
@jvlflame
Copy link
Collaborator

jvlflame commented Sep 26, 2020

This is a good idea. How I originally envisioned users using Javinizer would be using scrapers in conjunction with each other to grab the necessary metadata fields (i.e. dmm for description, r18 for id, etc.). Because Dmm didn't natively support ID, I didn't include it and assumed a user would just use a different scraper to grab the ID. But you bring up a good point of standardizing the ID format to match the rest of the scrapers.

There are two ways to go about this problem.

  1. Include a new field in $Data to support ContentId and Id. Possibly, include another field that keeps the original scraped id, called ScrapeId
  2. Keep the original data structure, but make modifications to Convert-JVString to support .

Definitely option 1 like you already implemented. This is similar to how the R18 scraper is already working.

With that, I've only made changes to DMM as a prototype, but if this proposal is accepted, I will continue to make changes for the others depending on the choice.

While I think it's feasible to convert from ContentId => Id, I don't believe the other way around is possible, at least not in the original format that dmm uses with things like h_###aaa00123. I believe most people will stick with the standardized DVD Id for their naming, and if they do want to use ContentId, they would still be able to with the Dmm and R18 scrapers.

Let's just:

  • Add the Id to the Dmm scraper without adding ContentId to the others (Done, pending review)
  • Add setting to prefer the ID type
    • No fallback to the alternate ID type if the preferred is unavailable. I think people would prefer that their whole library follows a consistent schema in this case. If it automatically fell back to the alternative, it would break that consistency.
  • Update Convert-JVString to support <CONTENTID> <- Let's use the full name for the tag rather than CONTID

src/Javinizer/Private/Scraper.Dmm.ps1 Outdated Show resolved Hide resolved
@seeyabye
Copy link
Collaborator Author

I have presently made the following changes:

  • Added Id Preference setting to the configuration file. Users can choose to have id or contentid for DMM and R18. By default, it is set to id which corresponds to the ID-### format.
  • I have also included <CONTENTID>
  • Made changes to DMM Scraper so that age_check_done is presented to both DMM-Ja and DMM-En. I found times when DMM-Ja is forcing me to do the age verification before this change.
  • Fixed a bug related to label, maker, and series not returning any results for DMM-Ja

Remark
I've currently embedded the logic for choosing ID or ContentId in R18 and DMM's scraper. This means $movieDataObject.Id will return ID-### or ID##### depending on the preference setting configured.

If there's any other preferred way to do this, please let me know.

@seeyabye seeyabye marked this pull request as ready for review September 27, 2020 08:13
src/Javinizer/Public/Get-JVData.ps1 Show resolved Hide resolved
src/Javinizer/Public/Get-JVData.ps1 Show resolved Hide resolved
@@ -45,7 +48,7 @@ function Get-DmmData {
$movieDataObject = [PSCustomObject]@{
Source = if ($Url -match '/en/') { 'dmm' } else { 'dmmja' }
Url = $Url
Id = Get-DmmId -WebRequest $webRequest
Id = if ($IdPreference -eq "id") { Get-DmmId -WebRequest $webRequest } elseif ($IdPreference -eq "contentid") { Get-DmmContentId -WebRequest $webRequest }
Copy link
Collaborator

@jvlflame jvlflame Sep 27, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did a bit of thinking about having the Id convert to ContentId directly on the scrapers themselves.

I think it would be better to change it only on the aggregated data object, since if people were using Javinizer solely for the scraper functions, they wouldn't want the original scraper output to be changed depending on an Id preference.

My approach would be to consolidate the settings into a single setting scraper.option.idpreference and set the Id during the foreach loop that assigns all the aggregated data fields.

foreach ($field in $metadataFields) {
    $metadataPriority = (Get-Variable -Name "$($field)Priority" -ValueOnly)
    foreach ($priority in $metadataPriority) {
        $sourceData = $Data | Where-Object { $_.Source -eq $priority }

        if ($null -eq $aggregatedDataObject.$field) {
            if ($field -eq 'AlternateTitle') {
                $aggregatedDataObject.$field = $sourceData.Title
            } elseif ($field -eq 'Id') {
                if ($IdPreference -eq 'contentid') {
                    $aggregatedDataObject.$field = $sourceData.ContentId
                } else {
                    $aggregatedDataObject.$field = $sourcedata.Id
                }
            } else {
                $aggregatedDataObject.$field = $sourceData.$field
            }
            Write-JVLog -Write:$script:JVLogWrite -LogPath $script:JVLogPath -WriteLevel $script:JVLogWriteLevel -Level Debug -Message "[$($Data[0].Id)] [$($MyInvocation.MyCommand.Name)] [$field - $priority] Set to [$($sourceData.$field | ConvertTo-Json -Compress)]"
        }
    }
}

What do you think of this approach?

Copy link
Collaborator Author

@seeyabye seeyabye Sep 28, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I personally like this approach.

To be honest, I wasn't quite pleased with what I have done. I prefer not to have the logic encoded in the model (in this case $movieDataObject). I will include this in the changes. Thank you for the suggestion!

Copy link
Collaborator

@jvlflame jvlflame left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

@jvlflame jvlflame added this to In progress in 2.1.0 via automation Sep 29, 2020
@jvlflame jvlflame merged commit 79bef59 into javinizer:dev Sep 29, 2020
2.1.0 automation moved this from In progress to Done Sep 29, 2020
@jvlflame jvlflame mentioned this pull request Sep 30, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
No open projects
2.1.0
  
Done
Development

Successfully merging this pull request may close these issues.

None yet

2 participants