Version 304

hydrusnetwork · Apr 25, 2018 · 926db87
1 parent 6ba9ce8
commit 926db87

Showing 49 changed files with 1,602 additions and 854 deletions.
diff --git a/db/help my db is broke.txt b/db/help my db is broke.txt
@@ -16,10 +16,12 @@ Then check your hard drive's integrity.
 
 On Windows, go Start->Run (Win+R) and type 'cmd'. Type chkdsk into the new window and wait for it to scan your drive.
 
-If you find problems, then your drive has been compromised in some way, and you should view it as unreliable. If it is an old drive, you should think about buying a replacement. The exception to 'buy a new drive' is if the existing one is new, works well, and you can trace the error to a specific event, like you had an unprotected power surge during a storm that violently reset your computer. The other exception is if you cannot afford it. :/
+If it finds many problems, then your drive has been compromised in some way, and you should view it as unreliable. If it is an old drive, you should think about buying a replacement. The exception to 'buy a new drive' is if the existing one is new, otherwise works well, and you can trace the error to a specific event, such as an unprotected power surge during a storm that violently reset your computer. The other exception is if you cannot afford it. :/
 
 On Windows, tell chkdsk to fix the problems it found by running it again with the /F modifier, like 'chkdsk /F'.
 
+Another good tool is CrystalDiskInfo, which checks hard drive health at the physical level. If your drive is having trouble reading or seeking or it is piling up uncorrectable sectors, it is time to move everything off!
+
 If your hard drive is fine, please send me the details! If it could be my code breaking things, I want to know asap!
 
 
@@ -41,9 +43,14 @@ So: open the SQLite shell, which should be in the db directory, called sqlite3 o
 .open client.db
 PRAGMA integrity_check;
 
-The integrity check doesn't correct anything, but it lets you know the magnitude of the problem: if only a couple of issues are found, you may be in luck. There are several .db files in the database, and client.db may not be the one broken. If you do not know which file is already broken, try opening the other files in new shells to figure out the extent of the damage. client.mappings.db is usually the largest and busiest file in most people's databases, so it is a common victim.
+The integrity check doesn't correct anything, but it lets you know the magnitude of the problem: if only a couple of issues are found, you may be in luck. There are several .db files in the database, and client.db may not be the one broken. If you do not know which file is already broken, try opening the other files in new shells to figure out the extent of the damage. This is the same as with client.db, like so:
+
+.open client.mappings.db
+PRAGMA integrity_check;
+
+client.mappings.db is usually the largest and busiest file in most people's databases, so it is a common victim.
 
-If it doesn't look too bad, then go:
+If the errors do not look too bad, then go:
 
 .clone client_new.db
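
The integrity check described above can also be scripted when there are several .db files to look at. This is only a minimal sketch using Python's standard `sqlite3` module, not anything that ships with hydrus: the function name `check_db_files` and the `client*.db` glob pattern are assumptions made for illustration. Like the shell command, it only reports problems and corrects nothing. Run it with the client closed, ideally against a backup copy.

```python
import glob
import os
import sqlite3


def check_db_files(db_dir):
    """Run PRAGMA integrity_check on every client*.db file in db_dir.

    Returns a dict mapping filename -> list of result rows.
    A healthy file yields the single row 'ok'.
    (Hypothetical helper; the glob pattern is an assumption.)
    """
    results = {}
    for path in sorted(glob.glob(os.path.join(db_dir, 'client*.db'))):
        try:
            conn = sqlite3.connect(path)
            try:
                # integrity_check returns one text row per problem found
                rows = [row[0] for row in conn.execute('PRAGMA integrity_check;')]
            finally:
                conn.close()
        except sqlite3.DatabaseError as e:
            # badly damaged files may fail before the check can even run
            rows = ['could not check: {}'.format(e)]
        results[os.path.basename(path)] = rows
    return results


if __name__ == '__main__':
    for name, rows in check_db_files('.').items():
        print(name, '->', rows[:5])
```

Run from inside the db directory, this prints the first few result rows for each file, which is enough to see which files are damaged and roughly how badly.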
 

diff --git a/help/advanced_parents.html b/help/advanced_parents.html
@@ -20,7 +20,7 @@ <h3>tag parents</h3>
 			<p>Let's expand our weapon example:</p>
 			<p><img src="tag_parents_firearms.png" /></p>
 			<p>In that graph, adding <i>ar-15</i> to a file would also add <i>semi-automatic rifle</i>, <i>rifle</i>, and <i>firearm</i>. Searching for <i>handgun</i> would return everything with <i>m1911</i> and <i>smith and wesson model 10</i>.</p>
-			<p>This can obviously get as complicated and autistic as you like, but be careful of being too confident—this is just a fun example, but is an AK-47 truly <i>always</i> an assault rifle? Some people would say no, and beyond its own intellectual neatness, what is the purpose of attempting to create such a complicated and 'perfect' tree? Of course you can create any sort of parent tags on your local tags or your own tag repositories, but this sort of thing can easily lead to arguments between reasonable people. I only mean to say, as someone who does a lot of tag work, to try not to create anything 'perfect', as it usually ends up wasting time. Act from need, not toward purpose.</p>
+			<p>This can obviously get as complicated and autistic as you like, but be careful of being too confident--this is just a fun example, but is an AK-47 truly <i>always</i> an assault rifle? Some people would say no, and beyond its own intellectual neatness, what is the purpose of attempting to create such a complicated and 'perfect' tree? Of course you can create any sort of parent tags on your local tags or your own tag repositories, but this sort of thing can easily lead to arguments between reasonable people. I only mean to say, as someone who does a lot of tag work, to try not to create anything 'perfect', as it usually ends up wasting time. Act from need, not toward purpose.</p>
 			<h3>how you do it</h3>
 			<p>Go to <i>services->manage tag parents</i>:</p>
 			<p><img src="tag_parents_dialog.png" /></p>

diff --git a/help/changelog.html b/help/changelog.html
@@ -8,6 +8,53 @@
 		<div class="content">
 			<h3>changelog</h3>
 			<ul>
+				<li><h3>version 304</h3></li>
+				<ul>
+					<li>renamed the new 'tagcensor' object to 'tagfilter' (since it will end up doing a bunch of non-censoring jobs) and refactored it into clienttags</li>
+					<li>attached a tag filter object to all tag import options to act as a tag blacklist. all tags that go through the import pipeline (except for a couple of old legacy instances) are now checked against the blacklist, and if a bad tag is found, the file vetoes! tag import options has some new ui to handle this and background code to deal with inheritance from defaults and so on</li>
+					<li>new file import urls that have url classes, no matter their source, are now normalised!</li>
+					<li>all new file import urls are now tested against both the original and normalised version of the url, so even though previously parsed urls remain un-normalised, new urls that are pre-normalised the same will not count as new! -fingers crossed-</li>
+					<li>on update, the db will get normalised copies of all existing urls. this means many files will now have two versions of their urls--some ui to collapse everything down to only the normalised version (after some human eyes have passed in front of this big change) will come in the coming weeks</li>
+					<li>some sites where normalisation is a consistent problem for later redownloads (like e621, which appends 'preview' tags to the post url) _should_ now be caught reliably!</li>
+					<li>the 'allow subdomains' on edit url class panel is now named 'match subdomains' and has a tooltip to better explain how it works</li>
+					<li>'keep subdomains' is now 'keep matched subdomains' and has a tooltip as well</li>
+					<li>the 'keep matched subdomains' enabled behaviour (and some normalisation calculation) is now additionally governed by the 'associate url with files' value and api url conversion info rather than just 'match subdomains' and raw url type</li>
+					<li>fixed an issue that was stopping the 'associate url with files' option sticking in edit url class panel</li>
+					<li>edit url matches now resorts after an add or edit action</li>
+					<li>all listctrls with a wrapper panel now resort after an import from clipboard, png, or defaults call</li>
+					<li>url matches now match against www*. versions of their domain regardless of 'match subdomains' settings</li>
+					<li>updated xbooru url classes to prefer https</li>
+					<li>the manage url class links panel now has a 'clear' button to clear a url_class->parser link</li>
+					<li>introduced three new simple downloader parsers for yiff.party, thanks to @cuddlebear on discord for the submission</li>
+					<li>the old 'uninteresting mime' status has been expanded to a wider 'vetoed' status to represent all file imports that are abandoned without a particular error (e.g. tag blacklist, wrong filesize or resolution)</li>
+					<li>the import system now reports the total of 'num vetoed' as 'num ignored' in its summary statements</li>
+					<li>it now also reports 'num skipped'</li>
+					<li>the 'num successful' and 'num already in db' are now folded more neatly together in import cache summary statements</li>
+					<li>file downloads that are cancelled will now set a 'veto' state rather than a 'skip' state</li>
+					<li>improved file import exception handling across the board</li>
+					<li>improved how single-file-result parsing vetoes propagate up to the file import status cache</li>
+					<li>404 network errors will now provide a 'veto' status rather than an 'error'</li>
+					<li>vetoes will not count as errors when deciding whether a subscription should be abandoned early (so a bunch of decomp bombs or 404s will no longer stutter a subscription!)</li>
+					<li>misc fixes and improvements to the new download stuff</li>
+					<li>wrote a new parsing cache that saves a lot of work in the new parsing system</li>
+					<li>improved the 'is this url known?' test to better deal with situations where all the given urls are galleries or unrecognised--a better aggregate of file status is formed, and 'already in db'/'deleted' statuses will apply if there is no evidence otherwise (the dev got the new logic for this from a legit nightmare about urls downloading over and over, so let's hope it works out)</li>
+					<li>the 'is this url known?' logic also recovers from 1->n url->hash relationships where it does not expect them, trying to find 'already in db' hashes over 'deleted' ones</li>
+					<li>to clear up some ambiguity, galleries or subscriptions now give a different 'checking in x seconds' status when waiting on the first page of a query</li>
+					<li>the 'noneablebytescontrol', as seen in edit file import options, will now correctly disable/enable its bytes sub-control when it is none'ed</li>
+					<li>a persistent issue with the new network engine sometimes failing to correctly error after certain broken connections (the computer going to sleep mid-download was a common cause here) should now be recovered from and the connection naturally reattempted</li>
+					<li>added three new shortcuts to the 'main_gui' shortcut set that allow for opening a new 'urls', 'simple', or 'thread watcher' downloader page</li>
+					<li>added two more shortcuts to 'main_gui' for new 'page of pages' and 'duplicate filter page'</li>
+					<li>moved some old 'new page' menu code to the new application command system</li>
+					<li>added numerous 'duplicates' shortcuts to the 'media' shortcut set that will work on selections of thumbnails</li>
+					<li>the thumbnail duplicates menu actions now go through the new application command system</li>
+					<li>fixed an issue where the current tag parents cache was not refreshing when notified</li>
+					<li>inputting a short invalid syntactic input on a 'read' tag autocomplete such as '-' will now clear the system predicates list--system preds should now only show on a completely empty input</li>
+					<li>fixed an issue where certain combinations of 'remove a tag, then re-add it' nullipotent actions in a single manage tags dialog transaction were not applying reliably (sometimes, the subsequent mirror action was not occurring due to a processing re-order optimisation at the db level)</li>
+					<li>made some animation code a little safer and quieter as a test for some users who were getting blitzed with some deadwindow error spam in certain situations--let's see if this changes anything</li>
+					<li>replaced all the em dashes in the help with double hyphens as github pages was rendering them wrong</li>
+					<li>added CrystalDiskInfo recommendation to 'help my db is broke.txt'</li>
+					<li>misc cleanup</li>
+				</ul>
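
The "tested against both the original and normalised version of the url" entry above can be pictured with a small sketch. This is not the hydrus implementation: real normalisation is driven by url classes, whereas the `normalise_url` below is a hypothetical stand-in that simply lowercases the host and drops the query string and fragment. The names `normalise_url` and `url_is_known` and the example.com urls are assumptions made for illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def normalise_url(url):
    """Hypothetical normalisation: lowercase scheme and host, drop query and fragment."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(), parts.path, '', ''))


def url_is_known(url, known_urls):
    """Treat a url as known if its raw OR its normalised form is already stored.

    Checking both forms means older, un-normalised stored urls still match,
    while new urls that differ only by extra parameters do not count as new.
    """
    return url in known_urls or normalise_url(url) in known_urls


known = {'https://example.com/post/show/123'}
print(url_is_known('https://EXAMPLE.com/post/show/123?preview=1', known))
print(url_is_known('https://example.com/post/show/124', known))
```

The first call matches through the normalised form, and the second does not match at all, so only the second url would be treated as new.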
 				<li><h3>version 303</h3></li>
 				<ul>
 					<li>file post url classes can now be linked to parsers!</li>

diff --git a/help/faq.html b/help/faq.html
@@ -7,7 +7,7 @@
 	<body>
 		<div class="content">
 			<a id="repositories"><h3>what is a repository?</h3></a>
-			<p>A <i>repository</i> is a service in the hydrus network that stores a certain kind of information—files or tag mappings, for instance—as submitted by users all over the internet. Those users periodically synchronise with the repository so they know everything that it stores. Sometimes, like with tags, this means creating a complete local copy of everything on the repository. Hydrus network clients never send queries to repositories; they perform queries over their local cache of the repository's data, keeping everything confined to the same computer.</p>
+			<p>A <i>repository</i> is a service in the hydrus network that stores a certain kind of information--files or tag mappings, for instance--as submitted by users all over the internet. Those users periodically synchronise with the repository so they know everything that it stores. Sometimes, like with tags, this means creating a complete local copy of everything on the repository. Hydrus network clients never send queries to repositories; they perform queries over their local cache of the repository's data, keeping everything confined to the same computer.</p>
 			<a id="tags"><h3>what is a tag?</h3></a>
 			<p><a href="https://en.wikipedia.org/wiki/Tag_(metadata)">wiki</a></p>
 			<p>A <i>tag</i> is a small bit of text describing a single property of something. They make searching easy. Good examples are "flower" or "nicolas cage" or "the sopranos" or "2003". By combining several tags together ( e.g. [ 'tiger woods', 'sports illustrated', '2008' ] or [ 'cosplay', 'the legend of zelda' ] ), a huge image collection is reduced to a tiny and easy-to-digest sample.</p>
@@ -27,7 +27,7 @@
 				<li>A filename is not unique; did you mean this "04.jpg" or <i>this</i> "04.jpg" in another folder? Perhaps "04 (3).jpg"?</li>
 				<li>A filename is not guaranteed to describe the file correctly, e.g. hello.jpg</li>
 				<li>A filename is not guaranteed to stay the same, meaning other programs cannot rely on the filename address being valid or even returning the same data every time.</li>
-				<li>A filename is often—for <i>ridiculous</i> reasons—limited to a certain prohibitive character set. Even when utf-8 is supported, some arbitrary ascii characters are usually not, and different localisations, operating systems and formatting conventions only make it worse.</p>
+				<li>A filename is often--for <i>ridiculous</i> reasons--limited to a certain prohibitive character set. Even when utf-8 is supported, some arbitrary ascii characters are usually not, and different localisations, operating systems and formatting conventions only make it worse.</p>
 				<li>Folders can offer context, but they are clunky and time-consuming to change. If you put each chapter of a comic in a different folder, for instance, reading several volumes in one sitting can be a pain. Nesting many folders adds navigation-latency and tends to induce less informative "04.jpg"-type filenames.</li>
 			</ul>
 			<p>So, the client tracks files by their <i>hash</i>.</p>

diff --git a/help/getting_started_files.html b/help/getting_started_files.html
@@ -10,9 +10,9 @@
 			<h3 class="warning">a warning</h3>
 			<p class="warning">This is the real internet, not babby AOL. By default, absolutely nothing is shared, but if you screw around, the hydrus client gives you the power to screw up your life. If you want to do private sexy slideshows of your shy wife that's fine, but don't upload the pictures anywhere you don't absolutely trust and don't upload public tags that'll identify anyone. It is <b>impossible</b> to contain leaks of private information.</p>
 			<h3>the problem</h3>
-			<p>If you have ever seen something like this—</p>
+			<p>If you have ever seen something like this--</p>
 			<p><img src="pictures.png" title="After a while, I started just dropping everything in here unsorted. It would only grow, hungry and untouchable." /></p>
-			<p>—then you already know the problem: using a filesystem to manage a lot of images sucks.</p>
+			<p>--then you already know the problem: using a filesystem to manage a lot of images sucks.</p>
 			<p>Finding the right picture quickly can be difficult. Finding everything by a particular artist at a particular resolution is unthinkable. Integrating new files into the whole nested-folder mess is a further pain, and most operating systems bug out when displaying 10,000+ thumbnails.</p>
 			<h3>so, what does the hydrus client do?</h3>
 			<p>Let's first focus on <i>importing</i> files.</p>
@@ -40,7 +40,7 @@ <h3>so, what does the hydrus client do?</h3>
 					<p>When you have some tags in your database, typing in the text box will search them:</p>
 					<p><img src="ac_dropdown_feel.png" /></p>
 					<p>The (number) shows how many files have that tag, and hence how large the search result will be if you select that tag.</p>
-					<p>Clicking 'searching immediately' will pause the searcher, letting you add several tags in a row without sending it off to get results immediately. Ignore the other buttons for now—you will figure them out as you gain experience with the program.</p>
+					<p>Clicking 'searching immediately' will pause the searcher, letting you add several tags in a row without sending it off to get results immediately. Ignore the other buttons for now--you will figure them out as you gain experience with the program.</p>
 				</li>
 				<li>You can remove from the list of 'active tags' in the box above with a double-click, or by entering the exact same tag again through the dropdown.</li>
 				<li>Play with the system tags more if you like, and the sort-by dropdown. The collect-by dropdown is advanced, so wait until you understand <i>namespaces</i> before expecting it to do anything.</li>
@@ -76,9 +76,9 @@ <h3>inbox and archiving</h3>
 			<p>Anything you do not want to keep should be deleted by selecting from the right-click menu or by hitting the delete key. Deleted files are sent to the trash. They will get a little trash icon:</p>
 			<p><img src="processed_imports.png" /></p>
 			<p>A trashed file will not appear in subsequent normal searches, although you can search the trash specifically by clicking the 'my files' button on the autocomplete dropdown and changing the file domain to 'trash'. Undeleting a file (shift+delete) will return it to 'my files' as if nothing had happened. Files that remain in the trash will be permanently deleted, usually after a few days. You can change the permanent deletion behaviour in the client's options.</p>
-			<p>A quick way of processing new files is—</p>
+			<p>A quick way of processing new files is--</p>
 			<h3>filtering</h3>
-			<p>Lets say you just downloaded a good thread, or perhaps you just imported an old folder of miscellany. You now have a whole bunch of files in your inbox—some good, some awful. You probably want to quickly go through them, saying <i>yes, yes, yes, no, yes, no, no, yes</i>, where <i>yes</i> means 'keep and archive' and <i>no</i> means 'delete this trash'. <b>Filtering</b> is the solution.</p>
+			<p>Lets say you just downloaded a good thread, or perhaps you just imported an old folder of miscellany. You now have a whole bunch of files in your inbox--some good, some awful. You probably want to quickly go through them, saying <i>yes, yes, yes, no, yes, no, no, yes</i>, where <i>yes</i> means 'keep and archive' and <i>no</i> means 'delete this trash'. <b>Filtering</b> is the solution.</p>
 			<p>Select some thumbnails, and either choose <i>filter->archive/delete</i> from the right-click menu or hit F12. You will see them in a special version of the media viewer, with the following controls:</p>
 			<ul>
 				<li>Left-click, space, or F7: <b>keep and archive the file, move on</b></li>

diff --git a/help/getting_started_ratings.html b/help/getting_started_ratings.html
@@ -9,7 +9,7 @@
 			<p><a href="getting_started_tags.html"><--- Back to tags</a></p>
 			<p>The hydrus client supports two kinds of ratings: <i>like/dislike</i> and <i>numerical</i>. Let's start with the simpler one:</p>
 			<h3>like/dislike</h3>
-			<p>This can set one of two values to a file. It does not have to represent like or dislike—it can be anything you want. Go to <i>services->manage services->local->like/dislike ratings</i>:</p>
+			<p>This can set one of two values to a file. It does not have to represent like or dislike--it can be anything you want. Go to <i>services->manage services->local->like/dislike ratings</i>:</p>
 			<p><img src="ratings_like.png" /></p>
 			<p>You can set a variety of colours and shapes.</p>
 			<h3>numerical</h3>