Problems matching scenes to files #1092
Comments
because... SLRSexBabes is NOT the studio. The studio in your example is SexBabesVR. |
I had mistyped SexBabesVR as SLRSexBabes in my original post. I closed this issue because I thought I'd let this drop until I did more homework. I've done that and I still have the problem. Here are details on the file/scene not matching ...
I have enabled and scraped SexBabesVR multiple times. I've rescanned the files. I just cannot get any match. This is the HTML for the entry of the scene in the SLR website. I assume this is what you scrape ...
The scene shows up as gray in heresphere but plays when clicked. This makes sense. |
SexBabesVR is not scraped from SLR; it is scraped from the SexBabesVR site itself, so SexBabesVR scenes downloaded from SLR will have to be matched manually. If you look in Scene Details/Edit Scene (pencil icon), there is a Filenames tab listing the filenames that will match automatically. |
Ah, that makes sense but it is a real pita. Of the first 80 scenes downloaded 20 didn't match. Aggregators like SLR have a zillion scenes from other sites and manually matching all of them is a lot of work. This is a flaw in the system. I have two possible solutions that I would like to be considered feature requests.
I hope I'm not coming off as a jerk. I make the suggestions as friendly advice. BTW, I'm a retired programmer. Do you think pull requests in this area would be welcome? I have already used close-matching code in an app I wrote to match downloaded videos for movies and TV to file names. |
I'm a retired programmer as well (unless something interesting comes along). As of 6 months ago, I had never used xbvr and had never developed on any of the technology platforms it is built on. So yes, I think PRs would absolutely be welcome; I think I'm up to about 50 now across all sorts of areas. New people also bring new ideas and perspectives, and there is certainly a variety in how people want to use the package. I have done something along the lines of matching for myself. I match all scenes from SLR and VRPorn to xbvr scenes that are scraped from their original site. I would say my automated matching is about 98-99%. One of my first PRs was to change the Search to allow you to target specific words at a specific field, i.e., site, cast, title or synopsis, which is the basis of my matching. I essentially build a huge search query with words from the other source, targeting those fields in xbvr, e.g., instead of just searching "Some scene title-An Actor" I search "title:Some title:scene title:title cast:an cast:actor". However, even with what I think is a very high match success rate, I'm not sure I'll ever submit it. That 1 or 2% would translate to 1 or 2 posts a week on Discord asking why it didn't match. And that's just noise I don't really want to deal with. I don't think you would ever get to 100%. There are just so many exceptions e.g.
Don't get me wrong, it's a good idea. My main concern would be people realizing and accepting that it's not 100%, and not creating an extra support requirement. Ideally, a crowd-sourced approach to correcting mismatches would work, but XBVR is all hosted locally, with no remote/web-based service where you could facilitate that. I would probably have tackled a solution if I had time to also build a UI to allow people to sort out their own mismatches, but I suspect that's as much work as the actual matching process. I don't mean to seem like I'm rubbishing the idea or even saying don't do it. Matching scenes between sites has huge benefits for other things as well; after all, I have built something for myself. I'm just highlighting some other considerations to think about. |
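For readers following along, the field-targeted query described in the comment above could be built along these lines. This is only a sketch: the function name `build_field_query` is hypothetical, the `title:`, `cast:` and `site:` prefixes are taken from the comment, and the word splitting is an assumption rather than how xbvr or the commenter's private matcher tokenizes text.

```python
import re
from typing import List, Optional


def build_field_query(title: str, cast: List[str], site: Optional[str] = None) -> str:
    """Prefix every word with the search field it should be matched against."""

    def words(text: str) -> List[str]:
        # Assumption: a "word" is any run of ASCII letters or digits.
        return re.findall(r"[A-Za-z0-9]+", text)

    terms = [f"title:{w}" for w in words(title)]
    for name in cast:
        terms += [f"cast:{w}" for w in words(name)]
    if site:
        terms += [f"site:{w}" for w in words(site)]
    return " ".join(terms)


if __name__ == "__main__":
    print(build_field_query("Some scene title", ["An Actor"]))
```

The resulting string would be pasted into the search box; whether the search treats upper and lower case the same is not stated in the thread, so the words are passed through unchanged.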
Wow. I had no idea things were such a mess. My pollyanna idea was that an aggregator scene was just a pointer to the original studio's scene. I had no idea it was manually copied. That eliminates my idea 2 of smart matching. That leaves idea 1 which is to just scrape the aggregator like SLR and leave the original studio out of the equation. I just looked at the HTML I posted above and I noticed that the file name is not in there. The link is to Idea 1 would be easier and more reliable than 2. But would the benefits outweigh the effort? I might code it for SLR and CzechVR because those are the only two I care about. Some other retired programmer could do the others :-) If a pull request is ignored it wouldn't be a big deal. |
My 2 cents: if you have the time, give it a try. If it all works out, you have a PR that benefits others. If not, you may still have something more specific that meets your own personal needs, and you will still have learned how to make changes to xbvr, which could be applied to other PRs. |
OK, things are worse than I thought. When I try to match, which I haven't tried before, there is no match. I rescanned the files and rescraped. I tested two files. Any idea what could cause this? |
I'm not sure what the problem is? Are you just clicking the "Match" button and leaving the actual search box that comes up as is? You might want to change the search query. I took a SexBabesVR file with SexBabesVR's filename, It was the first option. Repeating the match search with the search entry as just the scene title increases the match "score" |
How did you know to add 29264? Did you somehow cheat? :-) With the full file name I got five pages with no match. When I reduced the search to just "bliss" I got 2 pages of results down from 5, and neither page had bliss in the title. This is backwards to most searches which show more results as you reduce the search term. Eventually reducing it to empty should show all scenes. When I put in "pure bliss" I got the original 5 pages. This is baffling to me as it doesn't make sense. Could search be broken? I guess I can watch scenes just based on the filenames with no other information. Bummer. |
I just had a thought (they are rare nowadays). I originally scanned the files and then tried to match. I realized later that I had never scraped SexBabesVR. After scraping it I rescanned the files. The match was still not found. Is it possible that scraping after file scanning creates a permanent mismatch? In other words the filenames are only matched to earlier scrapes? |
The default download file was called SLR_SexBabesVR_Purely Temptatious_2700p_29264_LR_180.mp4. If I need to manually match SLR files, as you would with SexBabesVR, you click Match. At this point don't get hung up on the name of the file. The search term when you click Match is pre-populated with the filename, but that's just a convenience to save some typing. Other than that, the filename is no longer relevant. Change the search term to find the scene you want. I usually get rid of everything from the filename except site and title, i.e. I would search "SexBabesVR Purely Temptatious". That's usually enough to match; if not, I may add the actor's name or prefix words from the title with title:. When you hit Match it's not about making the filename match. At this point auto matching based on filename has already failed; it's now just about finding the scene you want. Assuming you find it, as in the list in vt's example, you click Assign and xbvr will link that scene to your file, even though it wasn't in its list of valid filenames. It will add the filename to the list of valid filenames for that scene in your database. If you have the scene scraped and it's not coming up in the search results, then you may have a search index problem and should rebuild the index in Options/Cache/Search Index/Reset
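The trimming step described above (keep only site and title from the downloaded filename) can be sketched as follows. The layout `SLR_<site>_<title>_<resolution>_<id>_...` is inferred from the single filename quoted in this thread, so treat it as an assumption; the helper name `search_term_from_slr_filename` is hypothetical and is not part of xbvr.

```python
from pathlib import Path


def search_term_from_slr_filename(filename: str) -> str:
    """Reduce an SLR-style download name to a "site title" search term."""
    stem = Path(filename).stem  # drop the extension
    parts = stem.split("_")
    # Assumed layout: SLR_<site>_<title>_<resolution>_<id>_...
    if len(parts) < 3 or parts[0] != "SLR":
        return stem  # unknown layout: leave the name for manual editing
    site, title = parts[1], parts[2]
    return f"{site} {title}"


if __name__ == "__main__":
    name = "SLR_SexBabesVR_Purely Temptatious_2700p_29264_LR_180.mp4"
    print(search_term_from_slr_filename(name))
```

A title that itself contains an underscore would be cut short by this split, which is one more reason to treat the result as a starting point for the Match search box and not as an automatic match.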
How can I know that if I've never seen the scene?
I just clicked all buttons in that tab, rescanned, and rescraped. Now the search is totally broken. No results show up no matter what I put in the search box. Oh well ... |
Do you mean how can you know the actor? SLR lists them on the page for the scene where you downloaded it.
It's probably still rebuilding the search indexes. They get locked when they are updating. A large system can take a couple of hours |
I waited 3 hrs with no luck. Now I've restarted the docker container and scanned and scraped again. Still no joy in Mudville. I'm guessing the next step would be to destroy the container, reinstall it, rescan, and rescrape. I've spent enough time on this for now and will live with just filenames. I don't know what I could have done wrong to cause this. |
No lol, I just found the page for the scene on SLR, used the SLR/SBVR filename you previously mentioned, and the number I saw in the SLR URL: My file is actually from POVR, so I'd already renamed it to match SexBabesVR's own filenames to just have it match automatically as a test. It was originally named @mark-hahn , I tried a test with the scene and filename you're having trouble with. The video itself was "filler", ignore the bizarre resolution, file size, etc., I just grabbed the nearest/smallest .mp4 file I had to use as a stand in since I don't have the scene itself downloaded. The actual SexBabesVR filename matched on its own, as expected.
From within XBVR itself. "Vika P" doesn't have too many scenes anyways, so it wasn't hard to find, but do note that SBVR actually has her credited as "Aislin". I already had an "aka" in here though. I removed some of the existing filenames (there were 6-7 more) just to make sure it'd show up in the screenshot, but if a scene is still giving you a hard time, you could always just click that blue "Add item+" button, and paste in whatever filename SLR is using, and then "Save Scene Details". It's a bit more roundabout than using the "Match" function, but the next time you scan the drive the file is on, it will just automatically match. That's all clicking "Match" technically does - but in one step. It adds a new filename to that list, and then associates the file with the scene. As for my test with the "SLR filename" The correct video is there in the list, it's the 2nd result. Forgive me if you've already mentioned it, but you do already have the SexBabesVR scraper enabled, right?
Oh, indeed you have. Well, does the scene show up on the "Any" or "Not Downloaded" tabs if you filter by actress "Aislin" (or "Tiffany Tatum") and studio "SexBabesVR"? Last but not least, although I think @toshski already mentioned it, you can always try resetting the Search Index It will take some time to re-build it, and I'm probably mistaken, but I think it only triggers after you run a scraper (any scraper). PS. If the scene is missing, then it's not scraping properly, but either way, importing this content bundle will plop the scene in your library, with the SLR filename already added to it, and the AKA for Aislin. |
Thanks. I will investigate the actions you suggest if/when search is working again. |
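@mark-hahn I ran into something similar. In my download collection of years I have many files which I have changed the filenames of. In some cases using a pattern like the CzechVR ones prefixed with cvr and CzechVR Fetish ones with cvrc. In other cases I prefixed a number representing a date. Some others I added tags in the filenames. So matching is a bit of a mess. While being in a very pragmatic mood, I created this python script https://gist.github.com/DUZszyi/1f67e46d9ffda5ebec0e0db04d69b9a1 This opens the sqlite db and just pokes in the innards of xbvr, and it seems to work for me! There are three "matchers":
1. based on substring. Either the target file starts with the stem of a known file or the stem of a known file is a substring in target. This matches the cases where I added stuff.
2. special case for my cvr prefixing. might be caught by rule 1 too. I made rule 1 later :)
3. generic Levenshtein distance. if less than 25% of the characters in target need insertions, deletions, or substitutions to get to a known file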
Example of a match:
Example of a false positive:
Matching within the Levenshtein distance threshold didn't work out for this one. Not sure if this is of use to you or anyone else, but I thought it would be worth sharing. At the least it could be some inspiration for an actual built-in fuzzy matcher. I might update the gist later. |
I'd like to try it but I'm not sure how to run it inside the docker version
I'm running. I need to study up on docker and see if I can ssh into a
docker image.
…On Sun, Feb 5, 2023 at 5:01 PM DUZszyi ***@***.***> wrote:
@mark-hahn <https://github.com/mark-hahn> I ran into something similar.
In my download collection of years I have many files which I have changed
the filenames of.
In some cases using a pattern like the CzechVR ones prefixed with cvr and
CzechVR Fetish ones with cvrc. In other cases I prefixed a number
representing a date. Some others I added tags in the filenames. So matching
is a bit of a mess.
While being in a very pragmatic mood, I created this python script
https://gist.github.com/DUZszyi/1f67e46d9ffda5ebec0e0db04d69b9a1
This opens the sqlite db and just pokes in the innards of xbvr, and it
seems to work for me! There are three "matchers"
1. based on substring. Either the target file starts with the stem of
a known file or the stem of a known file is a substring in target. This
matches the cases where I added stuff.
2. special case for my cvr prefixing. might be caught by rule 1 too. I
made rule 1 later :)
3. generic Levenshtein distance. if less than 25% of the characters in
target need insertions, deletions, or substitutions to get to a known file
Not sure if this is of use to you or anyone else, but I thought it would be
worth sharing. at the least it could be some inspiration for an actual
built-in fuzzy matcher.
I might update the gist later.
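Rules 1 and 3 from the list above can be sketched roughly like this. This is an illustration under stated assumptions, not the code from the gist: the names `levenshtein` and `match_file` are hypothetical, names are compared lowercased and without extension, and rule 2 (the cvr prefix special case) is left out because its details are not given here.

```python
from pathlib import Path
from typing import Iterable, Optional


def levenshtein(a: str, b: str) -> int:
    """Edit distance: insertions, deletions and substitutions each cost 1."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # substitute
        prev = cur
    return prev[-1]


def match_file(target: str, known: Iterable[str],
               threshold: float = 0.25) -> Optional[str]:
    """Return the known filename that best explains target, or None."""
    target_stem = Path(target).stem.lower()
    known = list(known)
    # Rule 1: the stem of a known file appears inside the target name
    # (this also covers "target starts with the known stem").
    for name in known:
        stem = Path(name).stem.lower()
        if stem and stem in target_stem:
            return name
    # Rule 3: closest known stem, accepted only if fewer than `threshold`
    # of the target's characters would need to change.
    best, best_dist = None, None
    for name in known:
        dist = levenshtein(target_stem, Path(name).stem.lower())
        if best_dist is None or dist < best_dist:
            best, best_dist = name, dist
    if best is not None and target_stem and best_dist / len(target_stem) < threshold:
        return best
    return None


if __name__ == "__main__":
    known_files = ["studio-first-scene-180.mp4", "studio-second-scene-180.mp4"]
    print(match_file("20230205_studio-first-scene-180_tagged.mp4", known_files))
    print(match_file("studio-secnd-scene-180.mp4", known_files))
```

As the false positive mentioned earlier in the thread shows, a fixed distance threshold can still pair the wrong files, so results from a matcher like this are best reviewed before they are written to the database.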
|
<For some bizarre reason this original post was erased. It was about a problem matching files to scenes. The problem is detailed again in the 3rd post below>