Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added "exactcleanstrings" advanced setting. #19219

Closed
wants to merge 1 commit into from
Closed

Added "exactcleanstrings" advanced setting. #19219

wants to merge 1 commit into from

Conversation

bpoxy
Copy link
Contributor

@bpoxy bpoxy commented Feb 12, 2021

Description

Modified cleanstrings to remove only the regex match (rather than the match onward), if it is at the beginning of the string.
Added exactcleanstrings advanced setting that removes only the regex match.

Motivation and Context

This enables prefix removal while preserving the existing suffix removal behavior.
This is an alternative to the the existing cleanstrings advanced setting, which removes the regex match and everything after it.
The motivation is to enable more precise string cleaning (such as prefix removal).

How Has This Been Tested?

Applied this to the Leia branch and compiled for Windows x64 then tested 3 cases:
1. Match at beginning of movie name.
2. Match partway through movie name.
3. No match in movie name.

advancedsettings.xml:
<advancedsettings>
<video>
<cleanstrings>
<regexp>BD</regexp>
</cleanstrings>
</video>
</advancedsettings>

1. BD 50 First Dates (2004).mp4
2. 7 Days in Hell (2015) BD HEVC.mp4
3. 40 Days and 40 Nights (2002).mp4

All movies were successfully scraped in UI with log providing confirmation:
2021-02-12 10:19:09.678 T:12620 DEBUG: ADDON::CScraper::FindMovie: Searching for '7 Days in Hell' using The Movie Database scraper (path: 'C:\Users\Bejhan\AppData\Roaming\Kodi\addons\metadata.themoviedb.org', content: 'movies', version: '5.2.6')
...
2021-02-12 10:19:10.622 T:12620 DEBUG: VideoInfoScanner: Adding new item to movies:smb://BEJHAN-PC/Videos/Test/7 Days in Hell (2015) BD HEVC.mp4
...
2021-02-12 10:19:11.634 T:12620 DEBUG: ADDON::CScraper::FindMovie: Searching for '40 Days and 40 Nights' using The Movie Database scraper (path: 'C:\Users\Bejhan\AppData\Roaming\Kodi\addons\metadata.themoviedb.org', content: 'movies', version: '5.2.6')
...
2021-02-12 10:19:12.606 T:12620 DEBUG: VideoInfoScanner: Adding new item to movies:smb://BEJHAN-PC/Videos/Test/40 Days and 40 Nights (2002).mp4
...
2021-02-12 10:19:13.525 T:12620 DEBUG: ADDON::CScraper::FindMovie: Searching for '50 First Dates' using The Movie Database scraper (path: 'C:\Users\Bejhan\AppData\Roaming\Kodi\addons\metadata.themoviedb.org', content: 'movies', version: ~'5.2.6')
...
2021-02-12 10:19:13.865 T:12620 DEBUG: VideoInfoScanner: Adding new item to movies:smb://BEJHAN-PC/Videos/Test/BD 50 First Dates (2004).mp4

Applied this to the Leia branch and compiled for Windows x64 then tested 4 cases:

  1. Match at beginning of movie name.
  2. Match partway through movie name.
  3. Match at the end of movie name.
  4. No match in movie name.

advancedsettings.xml:

<advancedsettings>
	<video>
		<exactcleanstrings>
			<regexp>BD</regexp>
		</exactcleanstrings>
	</video>	
</advancedsettings>
  1. BD 50 First Dates (2004).mp4
  2. 8 BD Mile (2002).mp4
  3. 7 Days in Hell (2015) BD.mp4
  4. 40 Days and 40 Nights (2002).mp4

All movies were successfully scraped in UI with log providing confirmation:

2021-02-28 20:43:56.104 T:7148   DEBUG: ADDON::CScraper::FindMovie: Searching for '7 Days in Hell' using The Movie Database scraper (path: 'C:\Users\Bejhan\AppData\Roaming\Kodi\addons\metadata.themoviedb.org', content: 'movies', version: '5.2.6')
...
2021-02-28 20:43:57.071 T:7148   DEBUG: VideoInfoScanner: Adding new item to movies:smb://BEJHAN-PC/Videos/Test Movies/7 Days in Hell (2015) BD.mp4
...
2021-02-28 20:43:57.103 T:7148   DEBUG: ADDON::CScraper::FindMovie: Searching for '8  Mile' using The Movie Database scraper (path: 'C:\Users\Bejhan\AppData\Roaming\Kodi\addons\metadata.themoviedb.org', content: 'movies', version: '5.2.6')
...
2021-02-28 20:43:57.450 T:7148   DEBUG: VideoInfoScanner: Adding new item to movies:smb://BEJHAN-PC/Videos/Test Movies/8 BD Mile (2002).mp4
...
2021-02-28 20:43:57.468 T:7148   DEBUG: ADDON::CScraper::FindMovie: Searching for '40 Days and 40 Nights' using The Movie Database scraper (path: 'C:\Users\Bejhan\AppData\Roaming\Kodi\addons\metadata.themoviedb.org', content: 'movies', version: '5.2.6')
...
2021-02-28 20:43:58.235 T:7148   DEBUG: VideoInfoScanner: Adding new item to movies:smb://BEJHAN-PC/Videos/Test Movies/40 Days and 40 Nights (2002).mp4
...
2021-02-28 20:43:58.258 T:7148   DEBUG: ADDON::CScraper::FindMovie: Searching for '50 First Dates' using The Movie Database scraper (path: 'C:\Users\Bejhan\AppData\Roaming\Kodi\addons\metadata.themoviedb.org', content: 'movies', version: '5.2.6')
...
2021-02-28 20:43:58.650 T:7148   DEBUG: VideoInfoScanner: Adding new item to movies:smb://BEJHAN-PC/Videos/Test Movies/BD 50 First Dates (2004).mp4

Screenshots (if appropriate):

Types of change

  • Bug fix (non-breaking change which fixes an issue)
  • Clean up (non-breaking change which removes non-working, unmaintained functionality)
  • Improvement (non-breaking change which improves existing functionality)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that will cause existing functionality to change)
  • Cosmetic change (non-breaking change that doesn't touch code)
  • None of the above (please explain below)

Checklist:

  • My code follows the Code Guidelines of this project
  • My change requires a change to the documentation, either Doxygen or wiki
  • I have updated the documentation accordingly
  • I have read the Contributing document
  • I have added tests to cover my change
  • All new and existing tests passed

@bpoxy
Copy link
Contributor Author

bpoxy commented Feb 22, 2021

I see that @KarellenX has added the "Wiki: Needed" label.
Does this imply that I have to update the wiki before this pull request can be approved (if so, how do I go about doing that?) or is it just indicating that the wiki needs to be updated if/when my change is merged?

@KarellenX
Copy link
Member

Does this imply that I have to update the wiki before this pull request can be approved

No. It just means when/if it is merged, then a wiki edit is required to note the change, which can be done by anyone.

Approving this PR is not dependent on the wiki edit.

@Karlson2k
Copy link
Member

Maybe it makes sense to make more flexible and add the new advancedsettings parameter (to not break existing configs)?

@bpoxy
Copy link
Contributor Author

bpoxy commented Feb 25, 2021

Maybe it makes sense to make more flexible and add the new advancedsettings parameter (to not break existing configs)?

Previously, if the regex match was at the beginning of the file name, the entire file name would be removed.
Since doing that would be pointless, this change is not really breaking existing configs.

@garbear
Copy link
Member

garbear commented Feb 26, 2021

It'd be nice if could update a test case, but it looks like we have no unit tests for CUtil

@bpoxy
Copy link
Contributor Author

bpoxy commented Feb 26, 2021

It'd be nice if could update a test case, but it looks like we have no unit tests for CUtil

It seems there are actually some unit tests for CUtil in kodi\xbmc\TestUtil.cpp (though not for CUtil::CleanString()).
In order to facilitate a unit test for CUtil::CleanString(), the method signature would have to be modified to accept std::shared_ptr<CAdvancedSettings> advancedSettings (rather than internally retrieving it) so that it can be stubbed in the unit test.

Do you want me to go ahead with that?

@Karlson2k
Copy link
Member

Karlson2k commented Feb 26, 2021

The proposed changes introduce ambiguity.

It will be harder to document cleanup behaviour (which is already confusing) and may confuse users even more.

I think it's match better to add a new parameter, like <regexp2>, with regexp applicable to any part of the string. It will be more universal and will cover start of the string as well by ^BD (from example in PR description). Additionally it will be simpler to document and understand.

@bpoxy
Copy link
Contributor Author

bpoxy commented Mar 1, 2021

The proposed changes introduce ambiguity.

It will be harder to document cleanup behaviour (which is already confusing) and may confuse users even more.

I think it's match better to add a new parameter, like <regexp2>, with regexp applicable to any part of the string. It will be more universal and will cover start of the string as well by ^BD (from example in PR description). Additionally it will be simpler to document and understand.

It was easier to to add a separate exactcleanstrings advanced setting, rather than adding another child under the existing cleanstrings advanced setting, so I went that route. I updated my commit and the PR description, accordingly.

@bpoxy bpoxy changed the title Modified cleanstrings to support prefix removal. ~Modified cleanstrings to support prefix removal.~ Mar 1, 2021
@bpoxy bpoxy changed the title ~Modified cleanstrings to support prefix removal.~ Modified cleanstrings to support prefix removal. Mar 1, 2021
@Karlson2k
Copy link
Member

@bejhanj Thank you, looks good.

It is a good idea to not allow the entire string to be erased.
Maybe it worth to log entire string match at LOGDEBUG level?

@garbear Can you comment on #19219 (comment)?

@Karlson2k Karlson2k added Component: Video Type: Improvement non-breaking change which improves existing functionality v20 Nexus labels Mar 1, 2021
@bpoxy bpoxy changed the title Modified cleanstrings to support prefix removal. Added "exactcleanstrings" advanced setting (as a more flexible alternative to "cleanstrings"). Mar 1, 2021
@bpoxy bpoxy changed the title Added "exactcleanstrings" advanced setting (as a more flexible alternative to "cleanstrings"). Added "exactcleanstrings" advanced setting. Mar 1, 2021
…atch (rather than removing the match and everything after as the "cleanstrings" advanced setting does).
@bpoxy
Copy link
Contributor Author

bpoxy commented Mar 1, 2021

It is a good idea to not allow the entire string to be erased.
Maybe it worth to log entire string match at LOGDEBUG level?

Good idea, done.

@bpoxy
Copy link
Contributor Author

bpoxy commented Mar 19, 2021

Still waiting on @garbear to respond... @Karlson2k, any thoughts on this?

@garbear
Copy link
Member

garbear commented Mar 23, 2021

Looks better. My 2 comments:

  • Can the debug line result in log spew, such as for a large collection? Maybe this is ok if it's only a single spew (like game input)
  • It looks like there's some style differences, normally Jenkins detects these. Give it a try to match our code style, and I'll chime in with code for you if there's any more differences

Then it gets my +1!

@bpoxy
Copy link
Contributor Author

bpoxy commented Mar 26, 2021

  • Can the debug line result in log spew, such as for a large collection? Maybe this is ok if it's only a single spew (like game input)

Yes, it could result in log spew if A) a regex is invalid (which is no different than the way cleanstrings already behaves) or B) a regex matches the entire string for multiple items in the collection. Any suggestions of how to avoid the log spew?

  • It looks like there's some style differences, normally Jenkins detects these. Give it a try to match our code style, and I'll chime in with code for you if there's any more differences

I can't seem to spot the style differences, could you point them out please?

@bpoxy bpoxy closed this May 29, 2022
@garbear
Copy link
Member

garbear commented May 29, 2022

It looks like it was on me to finish providing feedback and I let this PR fall through the cracks. Sorry about that @bejhan , and if you re-open, I'll get this in.

@bpoxy
Copy link
Contributor Author

bpoxy commented May 30, 2022

It looks like it was on me to finish providing feedback and I let this PR fall through the cracks. Sorry about that @bejhan , and if you re-open, I'll get this in.

No worries @garbear, it happens; I could've followed up as well. I actually no longer have a need for this feature as I've since conformed to Kodi's naming conventions. However, I can reopen if you think this feature would be of use to others.

@garbear
Copy link
Member

garbear commented May 30, 2022

Maybe make a post on the forum about it, see if anyone has input.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Component: Video Type: Improvement non-breaking change which improves existing functionality
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants