cleanup/slugify of schema fields #24
Comments
I'll look into it. It looks like the base slugify package was abandoned, but python-slugify is active. I'm thinking the only fields that would need it are post_title, name and filename.
For the fields, it would probably be good to check every field that can be used by the "--file-format" option.
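To make "check all fields used by --file-format" concrete, here is a minimal sketch that lists the attributes a format string references. It uses only the standard library's `string.Formatter`; the helper name `referenced_fields` is hypothetical, and the `ref.` prefix is assumed from the format strings shown elsewhere in this thread, not taken from the party source.

```python
from string import Formatter


def referenced_fields(file_format: str) -> list[str]:
    """Return the attribute names used in a format string, in order, without repeats."""
    fields: list[str] = []
    # Formatter.parse yields (literal_text, field_name, format_spec, conversion).
    for _literal, field_name, _spec, _conversion in Formatter().parse(file_format):
        if not field_name:
            continue  # trailing literal text has no field
        # Assumption: fields look like "ref.post_id"; keep the part after "ref.".
        attr = field_name.split(".", 1)[-1]
        if attr not in fields:
            fields.append(attr)
    return fields


print(referenced_fields("{ref.post_id}_{ref.post_title}_{ref.index}.{ref.extension}"))
# → ['post_id', 'post_title', 'index', 'extension']
```

Knowing which fields a given format actually uses would let the cleanup run only on those values.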
Currently I'm not restricting it. Anything in the Attachment class is open for use. So, [filename, name, path, post_id, post_title, base_name, extension] are all valid. I've been playing with it, and have it working on my dev setup, but I'm going to add it as an optional flag so people with existing folders don't wind up with a bunch of duplicate files. I'm just messing with using it against filename without borking the extension, which happened in an early test.
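The extension problem mentioned above can be sketched as follows. This is not the code from the branch: `simple_slugify` is a hypothetical stdlib-only stand-in for python-slugify, and `slugify_filename` is an illustrative helper name. The idea is to split off the extension first, slugify only the stem, then join them back.

```python
import os
import re
import unicodedata


def simple_slugify(text: str) -> str:
    """Stdlib-only stand-in for a real slugify function (an assumption, not python-slugify)."""
    # Drop accents and any other non-ASCII characters.
    text = unicodedata.normalize("NFKD", text).encode("ascii", "ignore").decode("ascii")
    # Collapse every run of non-alphanumeric characters into one hyphen.
    text = re.sub(r"[^a-z0-9]+", "-", text.lower())
    return text.strip("-")


def slugify_filename(filename: str) -> str:
    """Slugify the stem only, so the '.' before the extension survives."""
    stem, ext = os.path.splitext(filename)
    return simple_slugify(stem) + ext


# Slugifying the whole name turns the dot into a hyphen and loses the extension.
print(simple_slugify("October last hours!.jpg"))    # → october-last-hours-jpg
print(slugify_filename("October last hours!.jpg"))  # → october-last-hours.jpg
```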
I have a branch up for it, if you want to give it a test drive. Run it through the wringer, and if it feels fine I'll merge it over once I add the switch to make sure others don't double their disk usage. So far it's a pretty straightforward change: 4b303c8
Here's a quick preview, just got the extra switch working. A little tricky considering how removed posts is from the CLI. And I really need to rework the CLI with common options to reduce verbosity.
(party-py3.11) (base) darkdragn@DESKTOP-M35OH2B:/mnt/d/src/party$ party kemono patreon kajin --file-format "{ref.post_id}_{ref.post_title}_{ref.index}.{ref.extension}" -d Kajin --limit 5 -d temp
2023-11-05 15:40:51.030 | DEBUG | party.cli:pull_user:112 - Excluded Extensions: []
⠹ User found: kajin; parsing posts...Duplicate files found, recommend using post_id
Downloading from user: kajin
100%|█████████████████████████████████████████████████████████████████████████████████████████████| 12/12 [00:16<00:00, 1.35s/it]
2023-11-05 15:41:11.034 | INFO | party.cli:pull_user:217 - Output status: Counter({<StatusEnum.SUCCESS: 1>: 12})
(party-py3.11) (base) darkdragn@DESKTOP-M35OH2B:/mnt/d/src/party$ jq '.' temp/.info
{
"user": {
"directory": "temp",
"id": "585637",
"indexed": "Sun, 23 Aug 2020 10:03:20 ",
"name": "kajin",
"service": "patreon",
"site": "https://kemono.party",
"updated": "Fri, 03 Nov 2023 20:12:05 ",
"url": "https://kemono.party/api/v1/patreon/user/585637"
},
"options": {
"exclude_extensions": [],
"files": true,
"exclude_external": true,
"base_url": "https://kemono.party",
"directory": "temp",
"ordered_short": false,
"file_format": "{ref.post_id}_{ref.post_title}_{ref.index}.{ref.extension}",
"sluglify": false
}
}
(party-py3.11) (base) darkdragn@DESKTOP-M35OH2B:/mnt/d/src/party$ party kemono patreon kajin --file-format "{ref.post_id}_{ref.post_title}_{ref.index}.{ref.extension}" -d Kajin --limit 5 -d temp --sluglify
2023-11-05 15:41:25.620 | DEBUG | party.cli:pull_user:112 - Excluded Extensions: []
⠇ User found: kajin; parsing posts...Duplicate files found, recommend using post_id
Downloading from user: kajin
100%|█████████████████████████████████████████████████████████████████████████████████████████████| 12/12 [00:14<00:00, 1.22s/it]
2023-11-05 15:41:43.372 | INFO | party.cli:pull_user:217 - Output status: Counter({<StatusEnum.SUCCESS: 1>: 12})
(party-py3.11) (base) darkdragn@DESKTOP-M35OH2B:/mnt/d/src/party$ jq '.' temp/.info
{
"user": {
"directory": "temp",
"id": "585637",
"indexed": "Sun, 23 Aug 2020 10:03:20 ",
"name": "kajin",
"service": "patreon",
"site": "https://kemono.party",
"updated": "Fri, 03 Nov 2023 20:12:05 ",
"url": "https://kemono.party/api/v1/patreon/user/585637"
},
"options": {
"exclude_extensions": [],
"files": true,
"exclude_external": true,
"base_url": "https://kemono.party",
"directory": "temp",
"ordered_short": false,
"file_format": "{ref.post_id}_{ref.post_title}_{ref.index}.{ref.extension}",
"sluglify": true
}
}
(party-py3.11) (base) darkdragn@DESKTOP-M35OH2B:/mnt/d/src/party$ ls temp
'91889193_SummerLadydevimon or SummerAngewomon?_0.jpg' '92026701_October last hours!_2.jpg'
'91889193_SummerLadydevimon or SummerAngewomon?_1.jpg' '92026701_October last hours!_3.jpg'
'91889193_SummerLadydevimon or SummerAngewomon?_2.jpg' '92026701_October last hours!_4.jpg'
91889193_summerladydevimon-or-summerangewomon_0.jpg 92026701_october-last-hours_0.jpg
91889193_summerladydevimon-or-summerangewomon_1.jpg 92026701_october-last-hours_1.jpg
91889193_summerladydevimon-or-summerangewomon_2.jpg 92026701_october-last-hours_2.jpg
'91968217_October last days!_0.jpg' 92026701_october-last-hours_3.jpg
'91968217_October last days!_1.jpg' 92026701_october-last-hours_4.jpg
91968217_october-last-days_0.jpg '92239087_My halloween cosplay_0.jpg'
91968217_october-last-days_1.jpg '92239087_My halloween cosplay_1.jpg'
'92026701_October last hours!_0.jpg' 92239087_my-halloween-cosplay_0.jpg
'92026701_October last hours!_1.jpg' 92239087_my-halloween-cosplay_1.jpg
Merged in v0.6.7
There are certain characters, such as "/", "\" and probably many more, that should be removed from the schema fields.
This is because they can cause unexpected behaviour, such as creating a new folder.
This may also allow the schema fields to be compatible with the Windows file system, which does not allow certain characters for folder and file names, a similar problem to #22.
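As a rough illustration of the request, the sketch below replaces path separators and the characters Windows rejects in file and folder names. The function name `sanitize_field` and the `_` replacement are assumptions made for the example, not part of party; the character set follows the documented Windows naming rules.

```python
import re

# Characters Windows rejects in names: < > : " / \ | ? * and ASCII control codes.
# "/" and "\" are also the path separators that would otherwise create new folders.
_UNSAFE_CHARS = re.compile(r'[<>:"/\\|?*\x00-\x1f]')


def sanitize_field(value: str, replacement: str = "_") -> str:
    """Replace unsafe characters in a single schema field value."""
    cleaned = _UNSAFE_CHARS.sub(replacement, value)
    # Windows also does not keep trailing dots or spaces in names.
    return cleaned.rstrip(". ")


print(sanitize_field("What?: part 1/2"))  # → What__ part 1_2
print(sanitize_field("a/b\\c"))           # → a_b_c
```

Applying this to each field value before it is substituted into the format string keeps a "/" in a post title from being read as a directory boundary.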