m2ools

Many implementations of functionality such as cache and retry already exist, hence the "me too" on these.

Description

The main use of these tools is to aid in common web scraping tasks.

jitter and jitterargs

jitter decorates a function to jitter its numeric return value in a particular way (see 'example_jitter.py').
jitterargs decorates a function to jitter its numeric arguments in a particular way (see 'example_jitter.py').

Both are most commonly used for timing delays, with the jitter factors set to 1 to ensure that the jittered value is non-negative. For other purposes one can choose to jitter more or less aggressively.

Both decorators are well documented in the code.
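
As an illustration of the idea only, here is a minimal sketch of the two decorator styles. It is not the m2ools implementation: the names jitter_sketch and jitterargs_sketch, the single factor parameter, and the multiplicative formula (a value x is scaled to a uniform random value between x*(1 - factor) and x*(1 + factor)) are all assumptions made for this example. See m2ools.py for the actual behavior.

```python
import functools
import random


def _jittered(value, factor):
    # Assumed formula: scale by a multiplier drawn uniformly
    # from the interval between 1 - factor and 1 + factor.
    return value * random.uniform(1.0 - factor, 1.0 + factor)


def _is_number(x):
    # bool is a subclass of int, so exclude it explicitly.
    return isinstance(x, (int, float)) and not isinstance(x, bool)


def jitter_sketch(factor=1.0):
    """Hypothetical decorator: jitter the numeric return value."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            return _jittered(func(*args, **kwargs), factor)
        return wrapper
    return decorator


def jitterargs_sketch(factor=1.0):
    """Hypothetical decorator: jitter numeric arguments, pass others through."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            new_args = [_jittered(a, factor) if _is_number(a) else a
                        for a in args]
            new_kwargs = {k: _jittered(v, factor) if _is_number(v) else v
                          for k, v in kwargs.items()}
            return func(*new_args, **new_kwargs)
        return wrapper
    return decorator


@jitter_sketch(factor=1.0)
def base_delay():
    return 2.0  # nominal delay in seconds


@jitterargs_sketch(factor=0.5)
def describe(seconds, label="pause"):
    return seconds, label
```

Under this assumed formula, a factor of 1 keeps a non-negative input non-negative (base_delay() lands between 0 and 4 seconds), which matches the reason given above for using 1 with timing delays.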

retry

Decorates a function to keep calling it until a satisfactory return value is obtained, for example retrying a request up to a defined maximum number of times. Configuration options cover variable delays between consecutive calls and a custom function to validate the return value. Well documented in the code. See 'example_retry.py'.
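
The retry-until-valid pattern described above can be sketched roughly as follows. This is a simplified stand-in, not the m2ools API: retry_sketch and its parameters max_tries, delay, and validate are hypothetical names chosen for this example, and the fixed delay is far less configurable than what the description promises.

```python
import functools
import time


def retry_sketch(max_tries=3, delay=0.0,
                 validate=lambda result: result is not None):
    """Hypothetical decorator: call func until validate(result) is truthy
    or max_tries calls have been made."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            result = None
            for attempt in range(1, max_tries + 1):
                result = func(*args, **kwargs)
                if validate(result):
                    return result
                if attempt < max_tries:
                    time.sleep(delay)  # fixed pause between consecutive calls
            return result  # last result, even though it failed validation
        return wrapper
    return decorator


calls = {"count": 0}


@retry_sketch(max_tries=5, delay=0.0, validate=lambda r: r == "ok")
def flaky_request():
    # Simulated request that only succeeds on the third call.
    calls["count"] += 1
    return "ok" if calls["count"] >= 3 else "error"


result = flaky_request()
```

Returning the last result after the final attempt is one possible design choice; raising an exception instead would be equally reasonable.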

cache

Decorates a function to cache its results to disk. Supports hoarding of results. Useful when developing a web scraper, to avoid sending the same request again and again on each new run, or simply when hoarding is desired to maintain a history of results as a time series. Supports a flexible format for specifying the maximum age of cached results before they go stale. See 'example_cache.py'.

TBD: This part of the code would benefit from better documentation.
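
To make the disk-caching idea concrete, here is a minimal sketch that uses only the standard library. It is an assumption-laden illustration rather than the m2ools implementation: disk_cache_sketch, cache_dir, and max_age_seconds are hypothetical names, the maximum age is a plain number of seconds instead of the flexible format mentioned above, results are pickled instead of written as CSV, and a stale entry is overwritten instead of hoarded.

```python
import functools
import hashlib
import pickle
import tempfile
import time
from pathlib import Path


def disk_cache_sketch(cache_dir, max_age_seconds):
    """Hypothetical decorator: reuse a result stored on disk while it is
    younger than max_age_seconds, otherwise call func and store anew."""
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Derive a file name from the function name and its arguments.
            key = pickle.dumps((func.__name__, args, sorted(kwargs.items())))
            path = cache_dir / (hashlib.sha256(key).hexdigest() + ".pkl")
            if path.exists():
                age = time.time() - path.stat().st_mtime
                if age <= max_age_seconds:
                    with path.open("rb") as fh:
                        return pickle.load(fh)
            result = func(*args, **kwargs)
            with path.open("wb") as fh:
                pickle.dump(result, fh)
            return result
        return wrapper
    return decorator


fetch_count = {"n": 0}


@disk_cache_sketch(tempfile.mkdtemp(), max_age_seconds=3600)
def fetch(url):
    # Stand-in for a real HTTP request.
    fetch_count["n"] += 1
    return f"payload for {url}"


first = fetch("https://example.com/a")
second = fetch("https://example.com/a")  # read back from disk, no new call
```

A hoarding variant could write each fresh result to a new timestamped file instead of overwriting, which would preserve the history of results as a time series.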

Usage

Usage is best illustrated in the example files. To preemptively avoid naming conflicts, it is recommended to import m2ools as m2 and then use @m2.jitter, @m2.retry(reachback='1 hour') and so forth.

Requirements

pandas is used only to enable caching of pandas.DataFrame objects in CSV format. NumPy and Matplotlib are used only in the example files.

Authors

Marko Dukši @LinkedIn

Version History

0.1
- Initial Release

License

This project is licensed under the MIT License.
