Sourcery refactored master branch #3

sourcery-ai · 2022-03-14T12:13:25Z

Branch master refactored by Sourcery.

If you're happy with these changes, merge this Pull Request using the Squash and merge strategy.

See our documentation here.

Run Sourcery locally

Reduce the feedback loop during development by using the Sourcery editor plugin:

Review changes via command line

To manually merge these changes, make sure you're on the master branch, then run:

git fetch origin sourcery/master
git merge --ff-only FETCH_HEAD
git reset HEAD^

py3langid/langid.py

sourcery-ai · 2022-03-14T12:13:28Z

py3langid/examples/_twokenize.py

-urlExtraCrapBeforeEnd = regex_or(punctChars, entity) + "+?"
+urlExtraCrapBeforeEnd = f'{regex_or(punctChars, entity)}+?'


Lines 58-195 refactored with the following changes:

Use f-string instead of string concatenation (use-fstring-for-concatenation)

This removes the following comments (why?):

# iOS 'emoji' characters (some smileys, some symbols) [\ue001-\uebbb]
# Standard version :) :( :] :D :P
# myleott: o.O and O.o are two of the biggest sources of differences
# reversed version (: D: use positive lookbehind to remove "(word):"
# TODO should try a big precompiled lexicon from Wikipedia, Dan Ramage told me (BTO) he does this
# between this and the Java version. One little hack won't hurt...
#inspired by http://en.wikipedia.org/wiki/User:Scapler/emoticons#East_Asian_style
# because eyes on the right side is more ambiguous with the standard usage of : ;
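
For anyone reviewing the f-string change by hand, a minimal sketch like the one below can confirm that both spellings build the same pattern string. The `regex_or` body and the two pattern values are simplified stand-ins assumed for illustration; they are not copied from `_twokenize.py`.

```python
def regex_or(*items):
    # Assumed behaviour of the helper: wrap the alternatives
    # in a single non-capturing group.
    return "(?:" + "|".join(items) + ")"


# Placeholder patterns for illustration only (not the real definitions).
punctChars = r"[.?!,:;]"
entity = r"&(?:amp|lt|gt|quot);"

# Original form: string concatenation.
concatenated = regex_or(punctChars, entity) + "+?"

# Refactored form: f-string interpolation.
formatted = f"{regex_or(punctChars, entity)}+?"

# Both forms must produce exactly the same regex source.
assert concatenated == formatted
print(formatted)
```

The literal part here (`+?`) contains no braces, so no escaping is needed. A literal regex quantifier such as `{2,3}` inside an f-string would have to be written `{{2,3}}`, which is worth checking before applying this rule to other patterns.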

py3langid/train/NBtrain.py

sourcery-ai · 2022-03-14T12:13:29Z

py3langid/train/common.py

-            f = lambda fn, chunks: pool.imap_unordered(fn, chunks, chunksize=chunksize)
-            yield f
+            yield lambda fn, chunks: pool.imap_unordered(fn, chunks, chunksize=chunksize)
    else:
        if initializer is not None:
            initializer(*initargs)
-        f = imap
-        yield f
-
+        yield imap


Function MapPool refactored with the following changes:

Inline variable that is immediately yielded (inline-immediately-yielded-variable)
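
A small sketch of why this inlining is behaviour-preserving, assuming `MapPool` is a generator-based context manager. The two functions below are hypothetical reductions of the serial branch, and the built-in `map` stands in for `imap`.

```python
from contextlib import contextmanager


@contextmanager
def map_pool_before():
    # Original shape: bind the callable to a name, then yield the name.
    f = map
    yield f


@contextmanager
def map_pool_after():
    # Refactored shape: yield the callable directly.
    yield map


def square(x):
    return x * x


with map_pool_before() as old_map, map_pool_after() as new_map:
    # The caller receives the same callable either way.
    assert old_map is new_map
    assert list(old_map(square, [1, 2, 3])) == list(new_map(square, [1, 2, 3]))
```

The temporary name `f` carries no extra information, so removing it only shortens the function. The same holds for the `lambda` in the multiprocessing branch, since a lambda is an expression and can be yielded without first being assigned.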

sourcery-ai · 2022-03-14T12:13:29Z

py3langid/train/index.py

-                for docname in filenames:
-                    candidates.append(os.path.join(dirpath, docname))
+                candidates.extend(os.path.join(dirpath, docname) for docname in filenames)


Function CorpusIndexer.__init__ refactored with the following changes:

Replace a for append loop with list extend (for-append-to-extend)
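
The equivalence of the two forms can be checked with a sketch such as the following. It assumes the surrounding loop iterates over `os.walk`, which the diff does not show; the function names and the temporary directory layout are made up for the example.

```python
import os
import tempfile


def collect_with_append(root):
    # Original shape: one append call per file name.
    candidates = []
    for dirpath, dirnames, filenames in os.walk(root):
        for docname in filenames:
            candidates.append(os.path.join(dirpath, docname))
    return candidates


def collect_with_extend(root):
    # Refactored shape: one extend call per directory, fed by a generator.
    candidates = []
    for dirpath, dirnames, filenames in os.walk(root):
        candidates.extend(os.path.join(dirpath, docname) for docname in filenames)
    return candidates


with tempfile.TemporaryDirectory() as root:
    # Build a tiny, made-up corpus tree: root/en/wiki/{a,b}.txt
    os.makedirs(os.path.join(root, "en", "wiki"))
    for name in ("a.txt", "b.txt"):
        with open(os.path.join(root, "en", "wiki", name), "w") as fh:
            fh.write("sample")

    # Sort before comparing, because directory listing order is not guaranteed.
    assert sorted(collect_with_append(root)) == sorted(collect_with_extend(root))
```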

sourcery-ai · 2022-03-14T12:13:29Z

py3langid/train/index.py

-        reject_langs = {
-            l
-            for l in lang_domain_count if lang_domain_count[l] < min_domain
-        }
-
-        # Remove the languages from the indexer
-        if reject_langs:
+        if reject_langs := {
+            l for l in lang_domain_count if lang_domain_count[l] < min_domain
+        }:


Function CorpusIndexer.prune_min_domain refactored with the following changes:

Use named expression to simplify assignment and conditional (use-named-expression)

This removes the following comments (why?):

# Remove the languages from the indexer
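
A sketch of the named-expression rewrite in isolation, using hypothetical helper functions and made-up counts rather than the real `CorpusIndexer` state:

```python
def rejected_before(lang_domain_count, min_domain):
    # Original shape: assign first, then test the result.
    reject_langs = {
        l for l in lang_domain_count if lang_domain_count[l] < min_domain
    }
    if reject_langs:
        return sorted(reject_langs)
    return []


def rejected_after(lang_domain_count, min_domain):
    # Refactored shape: assign and test in one step with ":=".
    if reject_langs := {
        l for l in lang_domain_count if lang_domain_count[l] < min_domain
    }:
        return sorted(reject_langs)
    return []


# Made-up counts of domains per language, for illustration only.
counts = {"en": 5, "de": 1, "fr": 3}

assert rejected_before(counts, 2) == rejected_after(counts, 2)
```

Assignment expressions require Python 3.8 or later, so this change only fits if the project's minimum supported version allows it. The name `reject_langs` also remains bound after the `if` block, exactly as it did with the separate assignment.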

py3langid/train/tokenize.py

sourcery-ai · 2022-03-14T12:22:51Z

Sourcery Code Quality Report

✅ Merging this PR will increase code quality in the affected files by 1.14%.

Quality metrics	Before	After	Change
Complexity	15.48 🙂	18.77 😞	3.29 👎
Method Length	59.99 ⭐	52.08 ⭐	-7.91 👍
Working memory	10.82 😞	10.28 😞	-0.54 👍
Quality	56.72% 🙂	57.86% 🙂	1.14% 👍

Other metrics	Before	After	Change
Lines	1695	914	-781

Changed files	Quality Before	Quality After	Quality Change
py3langid/langid.py	49.48% 😞	49.48% 😞	0.00%
py3langid/tools/printfeats.py	94.84% ⭐	96.01% ⭐	1.17% 👍
py3langid/train/common.py	80.33% ⭐	81.22% ⭐	0.89% 👍
py3langid/train/index.py	63.10% 🙂	63.28% 🙂	0.18% 👍

Here are some functions in these files that still need a tune-up:

File	Function	Complexity	Length	Working Memory	Quality	Recommendation
py3langid/langid.py	main	65 ⛔	623 ⛔	17 ⛔	7.88% ⛔	Refactor to reduce nesting. Try splitting into smaller methods. Extract out complex expressions
py3langid/langid.py	application	21 😞	223 ⛔	15 😞	30.34% 😞	Refactor to reduce nesting. Try splitting into smaller methods. Extract out complex expressions
py3langid/train/index.py	CorpusIndexer.__init__	10 🙂	146 😞	12 😞	50.01% 🙂	Try splitting into smaller methods. Extract out complex expressions
py3langid/train/index.py	CorpusIndexer.prune_min_domain	9 🙂	104 🙂	11 😞	57.78% 🙂	Extract out complex expressions
py3langid/langid.py	LanguageIdentifier.set_languages	7 ⭐	88 🙂	12 😞	60.16% 🙂	Extract out complex expressions

Legend and Explanation

The emojis denote the absolute quality of the code:

⭐ excellent
🙂 good
😞 poor
⛔ very poor

The 👍 and 👎 indicate whether the quality has improved or gotten worse with this pull request.

Please see our documentation here for details on how these metrics are calculated.

We are actively working on this report - lots more documentation and extra metrics to come!

Help us improve this quality report!

'Refactored by Sourcery'

9f418dd

sourcery-ai bot requested a review from adbar March 14, 2022 12:13

adbar added 4 commits March 14, 2022 13:20

delete file

3b172f2

delete file

b7dbef7

remove file from PR

ee79bca

remove file from PR

e9044ca

adbar closed this Mar 14, 2022
