# Package Auto Assembler

This tool streamlines the creation of single-module Python packages.
It automates as many aspects of package creation as possible,
to shorten the development cycle of reusable components and to maintain a consistent standard of quality
for reusable code. It simplifies package creation
to the point that it can be triggered automatically within CI/CD pipelines,
with minimal preparation and few requirements for new modules.


In [1]:
import sys
sys.path.append('../')
from python_modules.package_auto_assembler import (
    VersionHandler, ImportMappingHandler, RequirementsHandler,
    MetadataHandler, LocalDependaciesHandler, LongDocHandler,
    SetupDirHandler, PackageAutoAssembler)

## Usage examples

The examples cover:
1. package versioning
2. import mapping
3. extracting and merging requirements
4. preparing metadata
5. merging local dependencies into a single module
6. preparing README
7. assembling setup directory
8. making a package

### 1. Package versioning

#### Initialize VersionHandler

In [2]:
pv = VersionHandler(
    # required
    versions_filepath = '../tests/package_auto_assembler/lsts_package_versions.yml',
    log_filepath = '../tests/package_auto_assembler/version_logs.csv',
    # optional 
    default_version = "0.0.1")

#### Add new package

In [3]:
pv.add_package(
    package_name = "new_package", 
    # optional
    version = "0.0.1"
    )

#### Update package version

In [4]:
pv.increment_patch(
    package_name = "new_package"
)
# for a package that is not tracked yet
pv.increment_patch(
    package_name = "another_new_package",
    # optional
    default_version = "0.0.1"
)

There are no known versions of 'another_new_package', 0.0.1 will be used!
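The behaviour shown above (bump the patch number of a tracked package, fall back to a default version for an untracked one) can be sketched in a few lines. This is a minimal illustration, not the `VersionHandler` implementation: the helper names `increment_patch` and `bump` and the plain-dict storage are assumptions made for the example, and versions are assumed to follow a strict `major.minor.patch` format.

```python
def increment_patch(version: str) -> str:
    """Return `version` with its patch component increased by one."""
    major, minor, patch = (int(part) for part in version.split("."))
    return f"{major}.{minor}.{patch + 1}"


def bump(versions: dict, package_name: str, default_version: str = "0.0.1") -> str:
    """Bump a tracked package, or register an untracked one at the default."""
    if package_name in versions:
        versions[package_name] = increment_patch(versions[package_name])
    else:
        # Hypothetical fallback mirroring the message printed above.
        versions[package_name] = default_version
    return versions[package_name]


versions = {"new_package": "0.0.1"}
bump(versions, "new_package")
bump(versions, "another_new_package")
print(versions)  # {'new_package': '0.0.2', 'another_new_package': '0.0.1'}
```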


#### Display current versions and logs

In [5]:
pv.get_versions(
    # optional
    versions_filepath = 'lsts_package_versions.yml'
)

{'another_new_package': '0.0.1', 'new_package': '0.0.2'}

In [6]:
pv.get_version(
    package_name='new_package'
)

'0.0.2'

In [7]:
pv.get_logs(
    # optional
    log_filepath = 'version_logs.csv'
)

Unnamed: 0,Timestamp,Package,Version
0,2024-01-01 21:09:37,new_package,0.0.1
1,2024-01-01 21:09:39,new_package,0.0.2
2,2024-01-01 21:09:39,another_new_package,0.0.1
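The log above holds one timestamped row per version change. A rough sketch of how such a log could be appended to and read back with the standard `csv` module follows. The `append_log` helper is hypothetical and an in-memory buffer stands in for the log file; only the column names are taken from the output above.

```python
import csv
import io
from datetime import datetime

LOG_COLUMNS = ["Timestamp", "Package", "Version"]


def append_log(stream, package_name, version, timestamp=None):
    """Write one log row; the timestamp defaults to the current time."""
    timestamp = timestamp or datetime.now()
    csv.writer(stream).writerow(
        [timestamp.strftime("%Y-%m-%d %H:%M:%S"), package_name, version])


log = io.StringIO()  # stands in for the CSV log file
csv.writer(log).writerow(LOG_COLUMNS)
append_log(log, "new_package", "0.0.1", datetime(2024, 1, 1, 21, 9, 37))
append_log(log, "new_package", "0.0.2", datetime(2024, 1, 1, 21, 9, 39))

# Read the log back as a list of dictionaries keyed by column name.
rows = list(csv.DictReader(io.StringIO(log.getvalue())))
print(rows[-1]["Version"])  # 0.0.2
```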


#### Flush versions and logs

In [8]:
pv.flush_versions()
pv.flush_logs()

### 2. Import mapping

#### Initialize ImportMappingHandler

In [9]:
im = ImportMappingHandler(
    # required
    mapping_filepath = "../env_spec/package_mapping.json"
)

#### Load package mappings

In [10]:
im.load_package_mappings(
    # optional
    mapping_filepath = "../env_spec/package_mapping.json"
)

{'PIL': 'Pillow',
 'bs4': 'beautifulsoup4',
 'fitz': 'PyMuPDF',
 'attr': 'attrs',
 'dotenv': 'python-dotenv',
 'googleapiclient': 'google-api-python-client',
 'sentence_transformers': 'sentence-transformers',
 'flask': 'Flask',
 'stdlib_list': 'stdlib-list',
 'sklearn': 'scikit-learn',
 'yaml': 'pyyaml'}
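The mapping exists because the name used in an `import` statement is not always the name of the distribution that has to be installed. A minimal sketch of how such a mapping might be applied is shown below; the helper `to_package_name` is hypothetical and is not part of `ImportMappingHandler`.

```python
package_mappings = {
    "PIL": "Pillow",
    "bs4": "beautifulsoup4",
    "sklearn": "scikit-learn",
    "yaml": "pyyaml",
}


def to_package_name(import_name: str, mappings: dict) -> str:
    """Translate an imported module name into an installable package name."""
    # Only the top-level module matters: 'sklearn.metrics' -> 'sklearn'.
    top_level = import_name.split(".")[0]
    # Names without a mapping are assumed to match their package name.
    return mappings.get(top_level, top_level)


print(to_package_name("sklearn.metrics", package_mappings))  # scikit-learn
print(to_package_name("requests", package_mappings))         # requests
```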

### 3. Extracting and merging requirements

#### Initialize RequirementsHandler

In [12]:
rh = RequirementsHandler(
    # optional/required later
    module_filepath = "../tests/package_auto_assembler/mock_vector_database.py",
    package_mappings = {'PIL': 'Pillow',
                        'bs4': 'beautifulsoup4',
                        'fitz': 'PyMuPDF',
                        'attr': 'attrs',
                        'dotenv': 'python-dotenv',
                        'googleapiclient': 'google-api-python-client',
                        'sentence_transformers': 'sentence-transformers',
                        'flask': 'Flask',
                        'stdlib_list': 'stdlib-list',
                        'sklearn': 'scikit-learn',
                        'yaml': 'pyyaml'},
    requirements_output_path = "../tests/package_auto_assembler/",
    output_requirements_prefix = "requirements_",
    custom_modules_filepath = "../tests/package_auto_assembler/dependancies",
    python_version = '3.8'
)

#### List custom modules for a given directory

In [13]:
rh.list_custom_modules(
    # optional
    custom_modules_filepath="../tests/package_auto_assembler/dependancies")

['comparisonframe', 'shouter']

#### Check if module is a standard python library

In [14]:
rh.is_standard_library(
    # required
    module_name = 'shouter',
    # optional
    python_version = '3.8'
    )

False

#### Extract requirements from the module file

In [15]:
rh.extract_requirements(
    # optional
    module_filepath = "../tests/package_auto_assembler/mock_vector_database.py",
    custom_modules = ['comparisonframe', 'shouter'],
    package_mappings = {'PIL': 'Pillow',
                        'bs4': 'beautifulsoup4',
                        'fitz': 'PyMuPDF',
                        'attr': 'attrs',
                        'dotenv': 'python-dotenv',
                        'googleapiclient': 'google-api-python-client',
                        'sentence_transformers': 'sentence-transformers',
                        'flask': 'Flask',
                        'stdlib_list': 'stdlib-list',
                        'sklearn': 'scikit-learn',
                        'yaml': 'pyyaml'},
    python_version = '3.8'
)

['### mock_vector_database.py',
 'numpy',
 'dill==0.3.7',
 'attrs>=22.2.0',
 'requests==2.31.0',
 'hnswlib==0.7.0',
 'sentence-transformers==2.2.2']

#### Save requirements to a file

In [16]:
rh.write_requirements_file(
    # optional/required later
    module_name = 'mock_vector_database',
    requirements = ['### mock_vector_database.py',
                    'numpy',
                    'dill==0.3.7',
                    'attrs>=22.2.0',
                    'requests==2.31.0',
                    'hnswlib==0.7.0',
                    'sentence-transformers==2.2.2'],
    output_path = "../tests/package_auto_assembler/",
    prefix = "requirements_"
)

#### Read requirements

In [17]:
rh.read_requirements_file(
    # required
    requirements_filepath = "../tests/package_auto_assembler/requirements_mock_vector_database.txt"
)

['numpy',
 'dill==0.3.7',
 'attrs>=22.2.0',
 'requests==2.31.0',
 'hnswlib==0.7.0',
 'sentence-transformers==2.2.2']
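Note that the header line `### mock_vector_database.py` written earlier does not appear in the list that is read back. One plausible way to get that round trip, sketched here with hypothetical helpers and a temporary directory rather than the real handler methods, is to treat lines starting with `#` as comments when reading.

```python
import tempfile
from pathlib import Path


def write_requirements(path, requirements):
    """Write one requirement (or comment line) per line."""
    Path(path).write_text("\n".join(requirements) + "\n", encoding="utf-8")


def read_requirements(path):
    """Read requirements, dropping blank lines and comment lines."""
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    return [line.strip() for line in lines
            if line.strip() and not line.lstrip().startswith("#")]


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "requirements_mock_vector_database.txt"
    write_requirements(
        path, ["### mock_vector_database.py", "numpy", "dill==0.3.7"])
    loaded = read_requirements(path)

print(loaded)  # ['numpy', 'dill==0.3.7']
```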

### 4. Preparing metadata

#### Initializing MetadataHandler

In [2]:
mh = MetadataHandler(
    # optional/required later
    module_filepath = "../tests/package_auto_assembler/mock_vector_database.py"
    
)

#### Check if metadata is available

In [7]:
mh.is_metadata_available(
    # optional 
    module_filepath = "../tests/package_auto_assembler/mock_vector_database.py"
)

True

#### Extract metadata from module

In [8]:
mh.get_package_metadata(
    # optional 
    module_filepath = "../tests/package_auto_assembler/mock_vector_database.py"
)

{'author': 'Kyrylo Mordan',
 'author_email': 'parachute.repo@gmail.com',
 'version': '0.0.1',
 'description': 'A mock handler for simulating a vector database.',
 'keywords': ['python', 'vector database', 'similarity search']}
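This section does not show how the metadata is stored inside the module. As an illustration only, the sketch below assumes it lives in a module-level dictionary named `__package_metadata__` (an assumption, as is the helper itself) and reads it with `ast`, so the module never has to be imported or executed.

```python
import ast


def get_package_metadata(source, variable="__package_metadata__"):
    """Return the literal dict assigned to `variable`, or None if absent."""
    tree = ast.parse(source)
    for node in tree.body:  # only module-level statements are inspected
        if isinstance(node, ast.Assign):
            for target in node.targets:
                if isinstance(target, ast.Name) and target.id == variable:
                    # literal_eval accepts an AST node and evaluates literals only.
                    return ast.literal_eval(node.value)
    return None


module_source = '''
__package_metadata__ = {
    "author": "Kyrylo Mordan",
    "version": "0.0.1",
    "keywords": ["python", "vector database", "similarity search"],
}

import undefined_dependency  # never executed, so this import is harmless
'''

metadata = get_package_metadata(module_source)
print(metadata["version"])  # 0.0.1
```

Parsing instead of importing means missing third-party dependencies in the module do not get in the way of reading its metadata.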

### 5. Merging local dependencies into a single module

#### Initializing LocalDependaciesHandler

In [8]:
ldh = LocalDependaciesHandler(
    # required
    main_module_filepath = "../tests/package_auto_assembler/mock_vector_database.py",
    dependencies_dir = "../tests/package_auto_assembler/dependancies/"
)

#### Combine main module with dependencies

In [12]:
print(ldh.combine_modules(
    # optional
    main_module_filepath = "../tests/package_auto_assembler/mock_vector_database.py",
    dependencies_dir = "../tests/package_auto_assembler/dependancies/"
)[0:1000])

"""
Mock Vector Db Handler

This class is a mock handler for simulating a vector database, designed primarily for testing and development scenarios.
It offers functionalities such as text embedding, hierarchical navigable small world (HNSW) search,
and basic data management within a simulated environment resembling a vector database.
"""

import logging
import json
import time
import numpy as np #==1.26.0
import dill #==0.3.7
import attr #>=22.2.0
import requests #==2.31.0
import hnswlib #==0.7.0
from sentence_transformers import SentenceTransformer #==2.2.2
import sklearn
import string
import os
import csv
from collections import Counter
from datetime import datetime #==5.2
import dill #==5.0.1
import pandas as pd #==2.1.1
from sklearn.metrics.pairwise import cosine_similarity #==1.3.1

@attr.s
class Shouter:

    """
    A class for managing and displaying formatted log messages.

    This class uses the logging module to create and manage a logger
    for displaying formatted messag

#### Save combined module

In [4]:
ldh.save_combined_modules(
    # optional
    combined_module = ldh.combine_modules(),
    save_filepath = "./combined_mock_vector_database.py"
)
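Conceptually, combining means inlining the source of each local dependency and dropping the import lines that referred to it, so the result works as one file. The sketch below is a deliberately simplified version of that idea. Unlike the output shown above, it does not hoist or deduplicate imports and simply places dependencies before the main module. The function names and the in-memory `dependencies` mapping are assumptions for illustration.

```python
import re


def is_local_import(line, module_names):
    """True when `line` imports one of the local dependency modules."""
    match = re.match(r"\s*(?:from|import)\s+\.?([A-Za-z_]\w*)", line)
    return bool(match) and match.group(1) in module_names


def combine_modules(main_source, dependencies):
    """Inline dependency sources and drop the imports that pointed at them."""
    kept = [line for line in main_source.splitlines()
            if not is_local_import(line, dependencies)]
    parts = list(dependencies.values()) + ["\n".join(kept)]
    return "\n\n".join(part.strip() for part in parts) + "\n"


dependencies = {
    "shouter": "class Shouter:\n"
               "    def shout(self, text):\n"
               "        return text.upper()\n",
}
main_source = (
    "from .shouter import Shouter\n"
    "\n"
    "def greet():\n"
    "    return Shouter().shout('hello')\n"
)

combined = combine_modules(main_source, dependencies)

# The combined source runs on its own, without the 'shouter' module.
namespace = {}
exec(combined, namespace)
print(namespace["greet"]())  # HELLO
```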

### 6. Prepare README

In [2]:
import logging
ldh = LongDocHandler(
    # optional/required later
    notebook_path = "./mock_vector_database.ipynb",
    markdown_filepath = "../mock_vector_database.md",
    timeout = 600,
    # logger
    loggerLvl = logging.DEBUG
)

#### Convert notebook to md without executing

In [3]:
ldh.convert_notebook_to_md(
    # optional
    notebook_path = "./mock_vector_database.ipynb",
    output_path = "../mock_vector_database.md"
)

Converted ./mock_vector_database.ipynb to ../mock_vector_database.md


#### Convert notebook to md with executing

In [4]:
ldh.convert_and_execute_notebook_to_md(
    # optional
    notebook_path = "./mock_vector_database.ipynb",
    output_path = "../mock_vector_database.md",
    timeout = 600
)

Using selector: KqueueSelector
Using selector: KqueueSelector
Converted and executed ./mock_vector_database.ipynb to ../mock_vector_database.md


#### Return long description

In [6]:
long_description = ldh.return_long_description(
    # optional
    markdown_filepath = "../mock_vector_database.md"
)
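A notebook file is JSON, so the non-executing conversion can be illustrated without any extra dependencies. The sketch below is a bare-bones stand-in written for this README: it is not `nbconvert`, it ignores cell outputs, and it is not necessarily what `LongDocHandler` does internally. The function name is hypothetical.

```python
import json

FENCE = "`" * 3  # code fence marker, built here to keep this example readable


def notebook_to_markdown(notebook_json: str) -> str:
    """Render markdown cells as-is and code cells as fenced Python blocks."""
    notebook = json.loads(notebook_json)
    chunks = []
    for cell in notebook.get("cells", []):
        # Cell source may be a list of lines or a single string.
        source = "".join(cell.get("source", []))
        if cell.get("cell_type") == "markdown":
            chunks.append(source)
        elif cell.get("cell_type") == "code":
            chunks.append(f"{FENCE}python\n{source}\n{FENCE}")
    return "\n\n".join(chunks) + "\n"


example = json.dumps({
    "cells": [
        {"cell_type": "markdown",
         "source": ["# Mock Vector Db Handler\n", "Usage notes."]},
        {"cell_type": "code", "source": ["print('hello')"], "outputs": []},
    ],
    "metadata": {}, "nbformat": 4, "nbformat_minor": 5,
})

markdown = notebook_to_markdown(example)
print(markdown)
```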

### 7. Assembling setup directory

#### Initializing SetupDirHandler

In [9]:
sdh = SetupDirHandler(
    # required
    module_filepath = "../tests/package_auto_assembler/mock_vector_database.py",
    # optional/required later
    module_name = "mock_vector_database",
    metadata = {'author': 'Kyrylo Mordan',
                'author_email': 'parachute.repo@gmail.com',
                'version': '0.0.1',
                'description': 'A mock handler for simulating a vector database.',
                'long_description' : long_description,
                'keywords': ['python', 'vector database', 'similarity search']},
    requirements = ['numpy',
                    'dill==0.3.7',
                    'attrs>=22.2.0',
                    'requests==2.31.0',
                    'hnswlib==0.7.0',
                    'sentence-transformers==2.2.2'],
    classifiers = ['Development Status :: 3 - Alpha', 
                   'Intended Audience :: Developers', 
                   'Intended Audience :: Science/Research', 
                   'Programming Language :: Python :: 3', 
                   'Programming Language :: Python :: 3.9', 
                   'Programming Language :: Python :: 3.10', 
                   'Programming Language :: Python :: 3.11', 
                   'License :: OSI Approved :: MIT License', 
                   'Topic :: Scientific/Engineering'],
    setup_directory = "./example_setup_dir"
    
)

#### Create empty setup dir

In [10]:
sdh.flush_n_make_setup_dir(
    # optional
    setup_directory = "./example_setup_dir"
)

#### Copy module to setup dir

In [11]:
sdh.copy_module_to_setup_dir(
    # optional
    module_filepath = "./combined_mock_vector_database.py",
    setup_directory = "./example_setup_dir"
)

#### Create init file

In [12]:
sdh.create_init_file(
    # optional
    module_name = "mock_vector_database",
    setup_directory = "./example_setup_dir"
)

#### Create setup file

In [13]:
sdh.write_setup_file(
    # optional
    module_name = "mock_vector_database",
    metadata = {'author': 'Kyrylo Mordan',
                'author_email': 'parachute.repo@gmail.com',
                'version': '0.0.1',
                'description': 'A mock handler for simulating a vector database.',
                'keywords': ['python', 'vector database', 'similarity search']},
    requirements = ['numpy',
                    'dill==0.3.7',
                    'attrs>=22.2.0',
                    'requests==2.31.0',
                    'hnswlib==0.7.0',
                    'sentence-transformers==2.2.2'],
    classifiers = ['Development Status :: 3 - Alpha', 
                   'Intended Audience :: Developers', 
                   'Intended Audience :: Science/Research', 
                   'Programming Language :: Python :: 3', 
                   'Programming Language :: Python :: 3.9', 
                   'Programming Language :: Python :: 3.10', 
                   'Programming Language :: Python :: 3.11', 
                   'License :: OSI Approved :: MIT License', 
                   'Topic :: Scientific/Engineering'],
    setup_directory = "./example_setup_dir"
)
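To make the last step more concrete, the sketch below renders a `setup.py` from the same kind of inputs. It is an illustration under assumptions, not the file that `write_setup_file` actually produces: the helper name and the exact layout are invented for this example, and only well-known `setuptools.setup` keywords are used.

```python
def render_setup_file(module_name, metadata, requirements, classifiers):
    """Return the text of a minimal setup.py for a single-module package."""
    arguments = {
        "name": module_name,
        "packages": [module_name],
        "install_requires": requirements,
        "classifiers": classifiers,
        **metadata,  # author, author_email, version, description, keywords
    }
    lines = ["from setuptools import setup", "", "setup("]
    # repr() turns each value into a valid Python literal.
    lines += [f"    {key}={value!r}," for key, value in arguments.items()]
    lines.append(")")
    return "\n".join(lines) + "\n"


setup_text = render_setup_file(
    module_name="mock_vector_database",
    metadata={"author": "Kyrylo Mordan",
              "version": "0.0.1",
              "description": "A mock handler for simulating a vector database."},
    requirements=["numpy", "dill==0.3.7"],
    classifiers=["Programming Language :: Python :: 3"],
)
print(setup_text)
```

Rendering values with `repr()` keeps the generated file valid Python even when descriptions contain quotes.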