CV_POM

Introduction

The CV POM framework provides tools to detect elements in image content and interact with them.

CV POM

CV POM converts any image into a page object model that gives you access to the elements recognized in the image. Each element carries properties such as its label and coordinates. Elements can also be converted to a JSON representation for easy integration with other tools.
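To make the idea of elements with properties and a JSON representation concrete, here is a minimal sketch. It does not use the cv_pom package: `DetectedElement`, its fields, and `elements_to_json` are hypothetical names chosen for illustration, and the real element class may expose different properties.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class DetectedElement:
    """Hypothetical stand-in for an element recognized in an image."""

    label: str
    x: int  # left edge of the bounding box, in pixels
    y: int  # top edge of the bounding box, in pixels
    width: int
    height: int

    @property
    def center(self) -> tuple[int, int]:
        """Midpoint of the bounding box, a natural point to click."""
        return (self.x + self.width // 2, self.y + self.height // 2)


def elements_to_json(elements: list[DetectedElement]) -> str:
    """Serialize the elements to a JSON array of objects."""
    return json.dumps([asdict(e) for e in elements], indent=2)


elements = [
    DetectedElement(label="reply-main", x=40, y=100, width=200, height=50),
    DetectedElement(label="rally", x=40, y=400, width=200, height=50),
]
print(elements_to_json(elements))
```

A flat list of plain objects like this is easy to hand to other tools, which is the kind of integration the JSON representation is meant for.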

CV POM Driver

CV POM Driver is built on top of CV POM and integrates with any automation framework (such as Selenium or Appium). You only need to override a couple of methods of the CVPOMDriver class, then use it as a driver to find elements and interact with them.

Because this approach requires no APIs from the application under test, it works for any platform/app combination, so you can automate every platform with the same APIs. It also lets you automate workflows based on the UI as it is rendered, which validates the styling and placement of each element, something most UI automation frameworks lack.

Install

pip install cv_pom

CV_POM usage

CVPOMDriver usage

First, override two methods of CVPOMDriver:

from pathlib import Path

from numpy import ndarray

from cv_pom.cv_pom_driver import CVPOMDriver


class MyCVPOMDriver(CVPOMDriver):
    def __init__(self, model_path: str | Path, your_driver) -> None:
        super().__init__(model_path)
        self._driver = your_driver  # Store your driver so that you can use it later

    def _get_screenshot(self) -> ndarray:
        """Add the code that takes a screenshot"""

    def _click_coordinates(self, x: int, y: int):
        """Add the code that clicks on the (x,y) coordinates"""
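The reason two hooks are enough can be sketched without the package or a model file. In the sketch below, `StandInDriver` is NOT the real CVPOMDriver: it is a hypothetical base class that only mimics the two hook methods named above, and `click_center` and `RecordingDriver` are likewise illustrative names. The point is that higher-level actions reduce to "take a screenshot" and "click at (x, y)", so any backend that supplies those two operations can be plugged in.

```python
class StandInDriver:
    """NOT the real CVPOMDriver: a tiny stand-in exposing the same two hooks."""

    def _get_screenshot(self):
        raise NotImplementedError

    def _click_coordinates(self, x: int, y: int):
        raise NotImplementedError

    def click_center(self, box: tuple[int, int, int, int]) -> None:
        """Click the middle of an (x_min, y_min, x_max, y_max) box."""
        x_min, y_min, x_max, y_max = box
        self._click_coordinates((x_min + x_max) // 2, (y_min + y_max) // 2)


class RecordingDriver(StandInDriver):
    """Fake backend that records clicks instead of touching a real app."""

    def __init__(self) -> None:
        self.clicks: list[tuple[int, int]] = []

    def _get_screenshot(self):
        # A real override would return an image array from the framework.
        return [[0, 0], [0, 0]]

    def _click_coordinates(self, x: int, y: int):
        self.clicks.append((x, y))


driver = RecordingDriver()
driver.click_center((10, 20, 110, 60))
print(driver.clicks)
```

A recording backend like this can also be handy for unit-testing automation logic without launching a browser or device.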

Then use it for automation:

framework_specific_driver = ... # Driver object you create with your automation framework of choice
model_path = "./my-model.pt"
cv_pom_driver = MyCVPOMDriver(model_path, framework_specific_driver)

# Find element by label
element = cv_pom_driver.find_element({"label": "reply-main"})
# Click on it
element.click()
# Wait until invisible
element.wait_invisible()
# Methods are also chainable
cv_pom_driver.find_element({"text": "some text"}).click()
# Get all elements to process them manually
cv_pom_driver.find_elements(None)
# Swipe/Scroll by coordinates coords=(x, y, x_end, y_end)
cv_pom_driver.swipe(coords=(10, 10, 400, 400))
# Swipe/Scroll by element
cv_pom_driver.find_element({"label": "reply-main"}).swipe(el=cv_pom_driver.find_element({"label": "rally"}))
# Swipe/Scroll by direction "up", "down", "left" and "right"
cv_pom_driver.find_element({"label": "reply-main"}).swipe(direction="down")

For more information about the query syntax, see the documentation of the POM.get_elements() method (cv_sdk/cv_pom.py).

CVPOM usage

See the tests or the CVPOMDriver implementation for examples of how to use the underlying CVPOM class.

As CLI

You can also inspect the elements in an image with the main.py script:

python main.py --model test/resources/best_august.pt --media test/resources/yolo_test_1.png
