# Cool Tools

## Work with Countries, Currencies, Subdivisions, and more

Do you work with international data?

You probably know how important it is to use the correct codes for countries, currencies, languages, and subdivisions.

To save the headache, try `pycountry` for Python!

`pycountry` makes it easy to work with these codes.

It lets you look up countries, currencies, languages, and subdivisions by name or by their ISO codes (for example, ISO 3166 for countries and ISO 4217 for currencies).

The lookup works in both directions: from a code to the full record, or from a name to its codes.

In [None]:
!pip install pycountry

In [None]:
import pycountry

# Get Country
print(pycountry.countries.get(alpha_2="DE"))

# Get Currency
print(pycountry.currencies.get(alpha_3="EUR"))

# Get Language
# Language codes are lowercase ("de"), unlike country codes ("DE")
print(pycountry.languages.get(alpha_2='de'))

## Generate better requirements files with `pipreqs`

To generate a `requirements.txt` file, don’t use `pip freeze > requirements.txt`.

It will save all packages in your environment including those you are not currently using in your project (but still have installed).

Instead, use `pipreqs`.

`pipreqs` only records the packages that your project actually imports.

A very good option for plain virtual environments.

In [None]:
!pip install pipreqs

In [None]:
!pipreqs .
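
The idea of deriving requirements from imports can be sketched with the standard library alone. This is not how `pipreqs` is implemented; it is a minimal illustration that collects top-level module names from source text, and the helper name `top_level_imports` is hypothetical.

```python
import ast


def top_level_imports(source: str) -> set[str]:
    """Return the top-level module names imported in a piece of Python source."""
    modules = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                modules.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # level == 0 skips relative imports such as "from . import helpers"
            modules.add(node.module.split(".")[0])
    return modules


example = "import numpy as np\nfrom pandas.io import sql\nfrom . import helpers\n"
print(sorted(top_level_imports(example)))
```

A real tool additionally has to map import names to package names on PyPI and filter out standard-library modules, which this sketch does not attempt.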

## Remove a package and its dependencies with `pip-autoremove`

When you remove a package via pip, you will encounter the following problem:

pip will remove the desired package but not its unused dependencies.

Instead, try `pip-autoremove`.

It will automatically remove a package and its unused dependencies. 

A really good option when you are not using something like Poetry.

In [None]:
!pip install pip-autoremove

In [None]:
!pip-autoremove flask -y

## Get distance between postal codes

Do you want the distance between two postal codes?

Use `pgeocode`.

Just specify the country and the two postal codes, and you get the distance in km.

In [None]:
!pip install pgeocode

In [None]:
import pgeocode

dist = pgeocode.GeoDistance('DE')
dist.query_postal_code('10117', '80331')
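
For intuition, the distance between two points on the globe can be sketched with the haversine (great-circle) formula. This assumes the distance is measured between one latitude/longitude pair per postal code; the coordinates below are rough city-centre values for Berlin and Munich chosen for illustration, not values looked up by `pgeocode`, and `haversine_km` is a hypothetical helper.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points given in degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    a = sin(dlat / 2) ** 2 + cos(lat1) * cos(lat2) * sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


# Approximate coordinates (assumption): Berlin and Munich city centres
berlin_to_munich = haversine_km(52.52, 13.405, 48.137, 11.575)
print(f"{berlin_to_munich:.0f} km")
```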

## Working with units with `pint`

Have you ever struggled with units in Python?

With `pint`, you don’t have to.

`pint` is a Python library for easy unit conversion and manipulation.

You can handle physical quantities with units, perform conversions, and perform arithmetic with physical quantities.

With `pint`, you keep track of your units and ensure accurate results.

In [None]:
!pip install pint

In [None]:
import pint

# Initializing the unit registry
ureg = pint.UnitRegistry()

# Defining a physical quantity with units
distance = 33.0 * ureg.kilometers
print(distance)
# 33.0 kilometer

# Converting between units
print(distance.to(ureg.feet))
# 108267.71653543308 foot

# Performing arithmetic operations
speed = 6 * distance / ureg.hour
print(speed)
# 198.0 kilometer / hour

## Supercharge your Python profiling with `Scalene`

Want to identify Python performance issues?

Try `Scalene`, your Profiler on steroids!

`Scalene` is a Python CPU + GPU + Memory profiler to identify bottlenecks.

Even with AI-powered optimization proposals!

`Scalene` comes with an easy-to-use CLI and web-based GUI.

In [None]:
!pip install scalene

In [None]:
!scalene my_module.py
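
To try the profiler, you need a script with something worth measuring. The file below is a hypothetical `my_module.py` (both function names are made up for this sketch) that contrasts a deliberately slow loop with a closed-form equivalent, so a profiler has an obvious hot spot to report.

```python
# my_module.py (hypothetical example script to profile)


def slow_sum_of_squares(n):
    """Deliberately naive loop so the profiler has a hot spot to report."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def fast_sum_of_squares(n):
    """Closed-form equivalent of 0^2 + 1^2 + ... + (n-1)^2."""
    return (n - 1) * n * (2 * n - 1) // 6


if __name__ == "__main__":
    print(slow_sum_of_squares(2_000_000))
    print(fast_sum_of_squares(2_000_000))
```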

## Fix unicode errors with `ftfy`

Have you ever struggled with Unicode errors in your Python code?

Try `ftfy`!

`ftfy` repairs scrambled text which occurs as a result of encoding or decoding problems. 

You have probably seen it when text containing non-ASCII characters shows up as garbled symbols.

In Python, a single call to `ftfy.fix_text` is enough to fix it.

In [None]:
!pip install ftfy

In [None]:
import ftfy

print(ftfy.fix_text('What does â€œftfyâ€\x9d mean?'))
print(ftfy.fix_text('âœ” Check'))
print(ftfy.fix_text('The Mona Lisa doesnÃƒÂ¢Ã¢â€šÂ¬Ã¢â€žÂ¢t have eyebrows.'))
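
To see where such scrambled text comes from, here is a minimal standard-library sketch of one common cause: bytes written as UTF-8 but read back with a different encoding (Latin-1 in this example). It only illustrates the mechanism; `ftfy` handles many more cases, including the nested ones above, without being told which encodings were involved.

```python
original = "café"

# Encode as UTF-8, then decode with the wrong codec: this produces mojibake
garbled = original.encode("utf-8").decode("latin-1")
print(garbled)  # cafÃ©

# Reversing the two steps restores the text, but only if the wrong codec is known
repaired = garbled.encode("latin-1").decode("utf-8")
print(repaired)  # café
```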

## Remove the background from images with `rembg`

Do you want to remove the background from images with Python?

Use `rembg`.

With its pre-trained models, `rembg` makes removing the background of your images easy.

In [None]:
!pip install rembg

In [None]:
from rembg import remove
import cv2

input_path = 'car.jpg'
output_path = 'car2.png'  # PNG keeps the transparent background; JPEG has no alpha channel

input_file = cv2.imread(input_path)
output_file = remove(input_file)
cv2.imwrite(output_path, output_file)

## Build modern CLI apps with `typer`

Tired of building clunky CLIs for your Python applications?

Try `typer`.

`typer` makes it easy to create clean, intuitive CLI apps that are easy to use and maintain.

It also comes with auto-generated help messages.

Ditch argparse.

In [None]:
!pip install typer

In [None]:
# hello_script.py
import typer

app = typer.Typer()

@app.command()
def hello(name: str):
    typer.echo(f"Hello, {name}!")
    
@app.command()
def bye(name: str):
    typer.echo(f"Bye, {name}!")

if __name__ == "__main__":
    app()

In [None]:
!python hello_script.py hello John

## Generate realistic fake data with `faker`

Creating realistic test data for your Python projects is annoying.

`faker` helps you to do that!

With just a few lines of code, you can generate realistic and diverse test data, such as:

- Names
- Addresses
- Phone numbers
- Email addresses
- Jobs

And more!

You can even set the locale for language- and region-specific output.

In [None]:
!pip install faker

In [None]:
from faker import Faker
fake = Faker('fr_FR')
print(fake.name())
print(fake.job())
print(fake.phone_number())

## Enrich your progress bars with `rich`

Do you want more colorful progress bars?

Use `rich`.

`rich` offers a nicely styled progress bar as an alternative to tqdm’s plain output.

With `rich.progress.track`, you wrap any iterable and get a colorful progress bar.

In [None]:
!pip install rich

In [None]:
from rich.progress import track
for url in track(range(25000000)):
    # Do something
    pass

## Set the description for TQDM bars

When you work with progress bars, you will probably use `tqdm`.

Did you know you can add descriptions to your bar?

You can do that with `set_description()`.

In [None]:
import tqdm
import glob

files = tqdm.tqdm(glob.glob("sample_data/*.csv"))
for file in files:
    files.set_description(f"Read {file}")

## Convert Emojis to Text with `emot`

Analyzing emojis and emoticons in texts can give you useful insights.

With `emot`, you can convert emoticons into words.

Especially useful for sentiment analysis.

In [None]:
!pip install emot

In [None]:
import emot 
emot_obj = emot.core.emot()
text = "I love python ☮ 🙂 ❤ :-) :-( :-)))" 
emot_obj.emoji(text) 

## Print hardware information and version numbers

When raising an issue, you should provide version numbers and hardware information.

With `watermark`, you can do that easily.

Just install the package and print.

In [None]:
!pip install watermark

In [None]:
from watermark import watermark
print(watermark())
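
For a sense of what such a report contains, here is a rough standard-library sketch that gathers similar details. It is not what `watermark` does internally, the selection of fields is an assumption, and `environment_report` is a hypothetical helper.

```python
import platform
import sys


def environment_report():
    """Collect basic interpreter and machine details for a bug report."""
    return {
        "Python implementation": platform.python_implementation(),
        "Python version": platform.python_version(),
        "OS": platform.system(),
        "Release": platform.release(),
        "Machine": platform.machine(),
        "Executable": sys.executable,
    }


for key, value in environment_report().items():
    print(f"{key:<22}: {value}")
```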

## Cache requests with `requests-cache`

Do you want better performance for requests?

Use `requests-cache`.

It caches HTTP requests so you don’t have to make the same requests again and again.

In the example below, a test endpoint with a 1-second delay is called 60 times.

With the standard `requests` library, this takes about 60 seconds.

With `requests-cache`, this takes about 1 second, because only the first call reaches the server.

In [None]:
!pip install requests-cache

In [None]:
# This takes 60 seconds
import requests

session = requests.Session()
for i in range(60):
    session.get('https://httpbin.org/delay/1')
    
    

# This takes 1 second
import requests_cache

session = requests_cache.CachedSession('test_cache')
for i in range(60):
    session.get('https://httpbin.org/delay/1')
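
The effect can be illustrated without any network access. The sketch below is a toy in-memory cache, not the way `requests-cache` is implemented (it persists responses and respects expiry, among other things); `CachedFetcher` and `slow_fetch` are hypothetical names, and the URL is only used as a cache key.

```python
import time


class CachedFetcher:
    """Wrap a fetch function and remember the result for each URL."""

    def __init__(self, fetch):
        self._fetch = fetch
        self._cache = {}
        self.misses = 0  # number of times the wrapped function was really called

    def get(self, url):
        if url not in self._cache:
            self.misses += 1
            self._cache[url] = self._fetch(url)
        return self._cache[url]


def slow_fetch(url):
    """Stand-in for a slow HTTP call; sleeps briefly instead of using the network."""
    time.sleep(0.01)
    return f"response for {url}"


fetcher = CachedFetcher(slow_fetch)
for _ in range(60):
    fetcher.get("https://httpbin.org/delay/1")

print(fetcher.misses)  # 1
```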

## Unify messy columns with `unifyname`

Do you want to unify messy string columns?

Try `unifyname`, which is based on fuzzy string matching.

This small library cleans up messy columns that contain hundreds of different spellings of the same value.

In [None]:
!pip install unifyname

In [None]:
import pandas as pd
from unifyname.utils import unify_names, deduplicate_list_string

data = pd.read_csv("")

data["BAIRRO DO IMOVEL"].value_counts()

data = unify_names(data,column='BAIRRO DO IMOVEL',threshold_count=500)

data["BAIRRO DO IMOVEL"].value_counts()
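
The underlying idea of fuzzy matching can be sketched with `difflib` from the standard library. This is an illustration only: `unifyname` may use a different similarity measure, the helper `unify` is hypothetical, and the sample values are made up.

```python
from difflib import get_close_matches


def unify(values, canonical, cutoff=0.8):
    """Replace each value by its closest canonical spelling, if one is similar enough."""
    unified = []
    for value in values:
        match = get_close_matches(value, canonical, n=1, cutoff=cutoff)
        unified.append(match[0] if match else value)
    return unified


names = ["Copacabana", "Copacabanna", "copacabana", "Ipanema", "Ipanemma"]
result = unify(names, ["Copacabana", "Ipanema"])
print(result)
```

Values with no sufficiently similar canonical spelling are left unchanged, so the `cutoff` controls how aggressive the clean-up is.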

## Check for broken links in a website

`linkchecker` is a Python tool that recursively crawls a website and checks for broken links.

Checking every link manually takes time you may not have.

And broken links can harm your search engine ranking.

See below how easy it can be to set up and use.


In [None]:
!pip install linkchecker

In [None]:
!linkchecker https://www.example.com
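
The first step of any link check is collecting the links on a page. The sketch below shows only that step, using the standard library on an inline HTML string; it is not how `linkchecker` works internally, it makes no network requests, and `LinkCollector` is a hypothetical class.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect absolute URLs from the href attributes of <a> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links against the page URL
                self.links.append(urljoin(self.base_url, href))


page = '<a href="/about">About</a> <a href="https://www.example.org/x">X</a> <a>no href</a>'
collector = LinkCollector("https://www.example.com")
collector.feed(page)
print(collector.links)
```

A full checker would then request each collected URL, record the failures, and repeat the process for pages on the same site.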

## Matplotlib for your Terminal

`bashplotlib` is a little library that displays basic ASCII graphs in your terminal.

It provides a quick way to visualize your data.

Currently, `bashplotlib` only supports histograms and scatter plots.

In [None]:
!pip install bashplotlib

In [None]:
!hist --file test.txt
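
The command above needs an input file. The sketch below writes a `test.txt` with sample data, assuming the input format is one number per line; the normal distribution and its parameters are arbitrary choices for illustration.

```python
import random
from pathlib import Path

random.seed(0)  # reproducible sample data

# 1000 draws from a normal distribution with mean 50 and standard deviation 10
samples = [random.gauss(50, 10) for _ in range(1000)]

# One value per line (assumed input format for the hist command)
Path("test.txt").write_text("\n".join(f"{x:.2f}" for x in samples) + "\n")
```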

## Display a Dependency Tree of your Environment

Do you want to understand where your dependency conflicts come from?

Try `pipdeptree`.

`pipdeptree` displays your installed Python packages in the form of a dependency tree.

It will also show you warnings when there are possible version conflicts.

It is a lightweight alternative when you are not using a tool like Poetry, which resolves dependencies for you automatically.

In [None]:
!pip install pipdeptree

In [None]:
!pipdeptree
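
The raw information behind such a tree is available from the standard library. The sketch below lists the declared direct requirements of each installed distribution via `importlib.metadata`; it does not build a full tree or detect conflicts as `pipdeptree` does, and `direct_requirements` is a hypothetical helper.

```python
from importlib.metadata import distributions


def direct_requirements():
    """Map each installed distribution name to its declared requirement strings."""
    tree = {}
    for dist in distributions():
        name = dist.metadata["Name"]
        if name:
            # dist.requires is None when a package declares no requirements
            tree[name] = sorted(dist.requires or [])
    return tree


# Print the first few packages with their direct requirements
for package, requirements in sorted(direct_requirements().items())[:5]:
    print(package)
    for requirement in requirements:
        print(f"  └── {requirement}")
```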

## Sort LaTeX acronyms automatically

I wrote a small library (`acrosort-tex`) to sort LaTeX acronyms with one command automatically.

It was a fun Sunday project where I really learned how easy it is to publish a package with Poetry.

Currently, it only supports acronyms in the following format:

`\acro{abbreviation}[shortform]{longform}`

but it's a beginning :)

See below for a small example.

Link to the repository: https://lnkd.in/eTF8qs5w

In [None]:
!pip install acrosort_tex

In [None]:
!acrosort old.tex new.tex
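
The core idea can be sketched in a few lines. This is not the code of `acrosort-tex`; it assumes every line is an `\acro{...}[...]{...}` entry and that sorting is case-insensitive on the abbreviation, and `sort_acronyms` is a hypothetical helper.

```python
import re

# Captures the abbreviation inside the first pair of braces after \acro
ACRO_KEY = re.compile(r"\\acro\{([^}]*)\}")


def sort_acronyms(lines):
    """Sort \\acro entries alphabetically by abbreviation, ignoring case."""
    return sorted(lines, key=lambda line: ACRO_KEY.search(line).group(1).lower())


entries = [
    r"\acro{SVM}[SVM]{Support Vector Machine}",
    r"\acro{CNN}[CNN]{Convolutional Neural Network}",
    r"\acro{api}[API]{Application Programming Interface}",
]

for entry in sort_acronyms(entries):
    print(entry)
```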

## Make ASCII Art from Text

Do you want to create ASCII art from text in your terminal?

With `pyfiglet`, you can generate banner-like text with Python.

This is a nice feature to introduce your users to your Python CLI apps.

In [None]:
!pip install pyfiglet

In [None]:
import pyfiglet

# Default font
print(pyfiglet.figlet_format('Hello, world!'))

# Alphabet font (font names are lowercase)
print(pyfiglet.figlet_format('Hello, world!', font='alphabet'))

# Bulbhead font
print(pyfiglet.figlet_format('Hello, world!', font='bulbhead'))

## Display NER with `spacy`

If you want to perform and visualize named-entity recognition (NER), use `spacy.displacy`.

It makes NER and visualizing detected entities super easy.

`displacy` has some other cool tools like visualizing dependencies within a sentence or visualizing spans, so check it out.

In [None]:
import spacy
from spacy import displacy

text = "Chelsea Football Club is an English professional football club based in Fulham, West London.\
        Founded in 1905, they play their home games at Stamford Bridge. \
        The club competes in the Premier League, the top division of English football. \
        They won their first major honour, the League championship, in 1955."

nlp = spacy.load("en_core_web_sm")
doc = nlp(text)
displacy.render(doc, style="ent", jupyter=True)

## Create TikZ pictures with Python

If you have ever written a paper in LaTeX, you probably used TikZ for your graphics.

TikZ is probably the most powerful tool to create graphic elements.

And notoriously hard to learn.

No need to worry, you can create TikZ figures in Python too.

With `tikzplotlib`, you can convert matplotlib figures into TikZ.

You can then insert the resulting plot in your LaTeX file.

Really useful when you don’t want the hassle with TikZ.

In [None]:
!pip install tikzplotlib

In [None]:
import tikzplotlib
import matplotlib.pyplot as plt
import numpy as np

plt.style.use("ggplot")

t = np.arange(0.0, 2.0, 0.1)
s = np.sin(2 * np.pi * t)
s2 = np.cos(2 * np.pi * t)
plt.plot(t, s, "o-", lw=4.1)
plt.plot(t, s2, "o-", lw=4.1)
plt.xlabel("time (s)")
plt.ylabel("Voltage (mV)")
plt.title("Simple plot $\\frac{\\alpha}{2}$")
plt.grid(True)

tikzplotlib.save("mytikz.tex")

## Human-readable RegEx with `PRegEx`

RegEx is notoriously nasty to read and write.

For a human-readable alternative, try `PRegEx`.

`PRegEx` is a Python library aiming to have an easy-to-remember syntax to write RegEx patterns.

It offers a way to easily break down a complex pattern into multiple simpler ones that can then be combined. 

See below how we can write a pattern that matches any URL that ends with either “.com” or “.org” as well as any IP address for which a 4-digit port number is specified.

In [None]:
from pregex.core.classes import AnyLetter, AnyDigit, AnyFrom
from pregex.core.quantifiers import Optional, AtLeastAtMost
from pregex.core.operators import Either
from pregex.core.groups import Capture
from pregex.core.pre import Pregex

http_protocol = Optional('http' + Optional('s') + '://')

www = Optional('www.')

alphanum = AnyLetter() | AnyDigit()

domain_name = \
  alphanum + \
  AtLeastAtMost(alphanum | AnyFrom('-', '.'), n=1, m=61) + \
  alphanum

tld = '.' + Either('com', 'org')

ip_octet = AnyDigit().at_least_at_most(n=1, m=3)

port_number = (AnyDigit() - '0') + 3 * AnyDigit()

# Combine sub-patterns together.
pre: Pregex = \
    http_protocol + \
    Either(
        www + Capture(domain_name) + tld,
        3 * (ip_octet + '.') + ip_octet + ':' + port_number
    )
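
For contrast, here is roughly the same pattern written by hand with the standard `re` module. It is an approximation put together for comparison, not the string that `PRegEx` generates, and `URL_OR_IP` is a hypothetical name.

```python
import re

URL_OR_IP = re.compile(
    r"(?:https?://)?"  # optional protocol
    r"(?:"
    # optional "www.", captured domain name, then ".com" or ".org"
    r"(?:www\.)?([A-Za-z0-9][A-Za-z0-9\-.]{1,61}[A-Za-z0-9])\.(?:com|org)"
    r"|"
    # four dot-separated octets, then a 4-digit port that does not start with 0
    r"(?:[0-9]{1,3}\.){3}[0-9]{1,3}:[1-9][0-9]{3}"
    r")"
)

for candidate in ["https://www.example.com", "192.168.1.1:8080", "192.168.1.1:0808"]:
    match = URL_OR_IP.fullmatch(candidate)
    print(candidate, "->", bool(match))
```

The raw pattern is compact but hard to review, which is the readability gap that the composable sub-patterns are meant to close.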