Noblesite/README.md

👋 Hi, I'm Jonathon Poe (a.k.a. Noblesite)

Mobile Solutions Architect & Senior Software Engineer with 15+ years of experience designing scalable enterprise mobility tools, automation platforms, and LLM-based assistant systems.

I specialize in:

  • πŸ” Mobile Security (MDM/MAM – Intune, Workspace ONE, Knox)
  • πŸ“± Native iOS & Android development
  • πŸ”„ Enterprise automation (API integration, workflows, device diagnostics)
  • 🧠 LLM-based tooling (QLoRA, LangChain, custom vector pipelines)

⚠️ Actively publishing refactored tools from my enterprise portfolio. Follow along as I surface, modernize, and document years of technical work.


🚀 Featured Project

E.M.A. Application

E.M.A. is a private research project focused on building scalable, AI-driven automation tools for enterprise platforms that expose REST APIs. It combines distributed processing, LLMs, and custom pipelines to create workflows for enterprise data orchestration, QA generation, and model training.

⚠️ This project is under active development and currently private.


Features

  • QA Pair Generation:
    • Generates high-quality question-answer pairs for domain-specific datasets.
    • Supports multiple question types: fact-based, why/how, multiple-choice, fill-in-the-blank, and true/false.
  • ChromaDB Integration:
    • Enables retrieval-augmented generation (RAG) by storing and retrieving embeddings.
  • Fine-Tuning Ready:
    • Supports fine-tuning large language models (e.g., LLaMA) with QLoRA for domain-specific tasks.
  • Distributed Processing:
    • Scalable workload distribution using Ray for parallel processing.
  • Validation and Cleaning:
    • Ensures generated QA pairs meet quality and consistency standards.
  • Centralized Configuration:
    • YAML files for managing global and component-specific settings.
  • Expansion Layer:
    • Multi-model QA generation and NER workflows.
    • Directed prompt engineering logic for multi-purpose dataset creation.
  • Preprocessing Layer:
    • Includes sliding-window context building, dataset cleaning, and key splitting.
  • Distributed Layer:
    • Ray-based JSONL cleanup, sharding, and batch dispatching to models.
  • Fine-Tuning Layer:
    • Deepspeed pipeline trainer, LoRA config support, sliding context batching.

Project Structure

E.M.A/
├── backend/                    # FastAPI backend
├── configs/                    # YAML config files
├── data_layer/                 # ChromaDB & ingestion logic
├── distributed_data_layer/     # Large JSONL datasets for distributed QA
├── distributed_processing/     # Ray actors and file distribution logic
├── embedding/                  # Embedding generation scripts
├── expansion_layer/            # Multi-model QA/NER workflows
├── fine_tuning_layer/          # DeepSpeed, QLoRA, dataset processing
├── frontend/                   # React-based interface
├── ingestion_layer/            # API/KB/Docs scraping utilities
├── md_notes/                   # Markdown project notes
├── model_layer/                # LLM engine & tool coordination logic
├── preprocessing_layer/        # JSONL deduplication & formatting
├── qa_generation/              # Legacy QA generation logic
├── ray_cluster/                # Ray head/worker node management
├── scripts/                    # Orchestration and utility scripts
├── utilities/                  # Logging, validation, system tools
├── workspace_one_workflows/    # Workspace ONE automation flows
└── .env / requirements.txt     # Environment & dependencies

Workflow

  1. Distributed Dataset Prep:

    • Shard, deduplicate, and clean raw datasets with distributed_cleaning_pipeline.py.
  2. Dataset Preparation:

    • Validate and clean input datasets (e.g., omnissa_apis_with_context_dataset.jsonl).
    • Generate embeddings and ingest them into ChromaDB.
  3. QA Pair Generation:

    • Run the QA generation pipeline with run_qa_generation.py to create question-answer pairs.
  4. Fine-Tuning:

    • Use run_fine_tuning.py to launch either deepspeed_trainer.py or pipeline_trainer.py.
    • Supports QLoRA, DeepSpeed ZeRO-3, sliding context batching, and Flash Attention.
  5. Deployment:

    • Deploy the fine-tuned model behind the FastAPI backend for real-time inference with token streaming.

Future Enhancements

  • Support for additional question types.
  • Automated hyperparameter tuning for fine-tuning workflows.
  • Advanced monitoring and analytics for QA pair generation.
  • LoRA weight merging utilities
  • Fine-tuned model hub export and quantization
  • Dataset tokenizer profiling for context budget planning

🖨️ EpsonLink

A fully native Android WebView wrapper for USB-connected Epson receipt printers using the ePOS2 SDK. Built for Android Enterprise deployments with a clean MVVM architecture and structured JSON print support.

A dynamic staging tool designed to configure and provision devices via MDM assignment groups, tag logic, and relay APIs.

Python-based automation framework for interacting with Workspace ONE UEM APIs, featuring DTO mapping and REST abstraction.
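
For readers unfamiliar with DTO mapping, the sketch below shows the general idea: a raw JSON payload from a REST response is converted into a typed, immutable object at the boundary. The class `DeviceDTO` and the payload keys (`Id`, `SerialNumber`, `Platform`) are hypothetical and chosen only for illustration; they are not taken from this framework or from the Workspace ONE UEM API documentation.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass(frozen=True)
class DeviceDTO:
    """Hypothetical DTO; field names are illustrative, not a real schema."""
    device_id: int
    serial_number: str
    platform: Optional[str] = None

    @classmethod
    def from_api(cls, payload: Dict[str, Any]) -> "DeviceDTO":
        # Map assumed API key names to Python-style attribute names and
        # ignore any extra keys the response may carry.
        return cls(
            device_id=int(payload["Id"]),
            serial_number=str(payload["SerialNumber"]),
            platform=payload.get("Platform"),
        )


if __name__ == "__main__":
    raw = {"Id": "42", "SerialNumber": "ABC123", "Platform": "Apple", "Extra": True}
    print(DeviceDTO.from_api(raw))
```

Keeping the mapping in one place means the rest of the code works with typed attributes and does not need to know the wire format.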


🛠️ Projects Published So Far

  • 📦 WorkspaceONE-To-Intune-iOS
    Seamless COPE/BYOD migration utility for iOS MDM transitions.

  • 🔬 EasyRest
    Lightweight REST client for debugging iOS APIs.

  • 💬 XMPPMessenger-iOS
    Secure real-time chat app built on XMPP.

  • 💍 The Proposal
    SpriteKit game with a surprise engagement ending.

  • 🧪 IPCDeviceUtility
    Internal sled diagnostic tool with MSR/scanner/firmware support.


🧰 Tools I Work With

  • Mobile: Swift, Objective-C, Kotlin, Java
  • Frontend: React Typescript Next.js Tailwind CSS PHP
  • Backend: Python (FastAPI, Flask), Node.js, Java (SpringBoot), C++, C# .NET, PHP
  • DevOps: GitHub Actions, CI/CD, scripting, Jenkins, Sonarqube
  • AI/LLM: LangChain, QLoRA, vector DBs, agent frameworks
  • Platform Agnostic

📫 Get in Touch


“Build fast. Stay secure. Leave tech cleaner than you found it.”

Popular repositories

  1. .github.io-qofas-passive-readout (Public)

    TeX 1

  2. EasyRest (Public)

  3. Noblesite (Public)

    Personal Webpage

    TypeScript

  4. llm_engineering_classes (Public)

    Jupyter Notebook

  5. Trust_Score_Mobile (Public)

    Swift

  6. The_Proposal (Public)

    Legacy Swift SpriteKit proposal game

    Swift