Anshumaan031/README.md

Hi there, I'm Anshumaan Phukan! 👋

🚀 About Me

Data Science and Generative AI professional with a Master of Science in Business Analytics from the National University of Singapore. I specialize in using Large Language Models and advanced Data Science methodologies to optimize data-driven decision-making and operational efficiency. Currently leading the TechSagar initiative at the Data Security Council of India.

🎓 Education

  • Master of Science in Business Analytics - National University of Singapore (July 2023 - Sept 2024)

    • Specialization in Advanced Machine Learning and AI
    • Master Thesis: Adaptive Forecasting Framework: A Hybrid LLM-Statistical Approach for Enhanced Zero Shot Prediction
  • Bachelor of Technology in Computer Science Engineering - Bennett University (August 2019 - May 2023)

    • Bachelor Thesis: Generative Adversarial Network Based Approach towards Synthetically Generating Insider Threat Scenarios

💼 Professional Experience

Technical Analyst (Project Lead) - Data Security Council of India, NASSCOM

Dec 2024 - Present

  • Leading the TechSagar initiative in collaboration with the National Cyber Security Coordinator's Office
  • Enhancing platform capabilities with a hybrid RAG pipeline for dynamic data retrieval
  • Overseeing India's emerging tech repository covering 27 technology areas and 4,000+ entities
  • Leveraging generative AI to improve retrieval precision and develop agentic data enrichment frameworks

Data Scientist - Johnson and Johnson, Singapore

Apr 2024 - Sept 2024

  • Developed RAG applications using LLMs to enhance financial Q&A assistants
  • Fine-tuned text generation models on confidential financial documents using QLoRA and Azure AI
  • Created a TimeLLM hybrid framework combining generative AI with traditional forecasting methods
  • Engineered end-to-end financial commentary systems with Azure OpenAI

Research Intern - Georgia Institute of Technology, US

January 2023 - May 2023

  • Integrated ensemble methods for insider threat detection, increasing detection rates by ~15%
  • Fostered collaborative efforts between AI, cybersecurity, and system administration
  • Implemented Wasserstein Conditional GANs (WCGANs) to mitigate security vulnerabilities

Computer Vision Intern - UAV Tech Pvt Ltd, India

Jan 2022 - May 2022

  • Developed autonomous drone prototypes using Robot Operating System
  • Designed multifunctional capabilities including real-time object detection and precision landing
  • Created custom deep learning models achieving 95% detection rates

🏆 Projects & Achievements

  • NUSights: Gen AI-based fintech solution bridging efficiency gaps in financial analysis (International Fintech Competition, Chengdu, China – Runner-Up)
  • Enhancing Digital Banking UX through Generative AI: Strategic analysis using LLMs
  • Multi-Agentic Company Data Researcher Framework: Intelligent system for cataloging emerging Indian tech startups
  • Advanced Maritime Traffic Analysis: Sophisticated data warehouse with star schema design for AIS messages
  • Image-Based Nutritional Estimation: Novel approach including Singaporean street foods

🛠️ Skills & Technologies

Knowledge Domains

  • Machine Learning & Data Science
  • Generative AI & LLMs
  • NLP & Network Analysis
  • Explainable AI
  • DataOps & MLOps
  • Prompt Engineering
  • Time Series Analysis

Programming Languages

  • Python
  • SQL
  • C++
  • HTML/CSS
  • XQuery

Frameworks & Libraries

  • Scikit-learn
  • PyTorch
  • Transformers
  • Keras
  • Hugging Face
  • LangChain
  • LlamaIndex
  • Pydantic AI
  • CrewAI
  • LangSmith
  • Qdrant

Tools & Platforms

  • Tableau
  • AWS
  • Git
  • SAS Viya
  • eXist-db
  • Neo4j
  • Docker
  • Apache Airflow
  • MLflow
  • FastAPI
  • Supabase
  • MongoDB Compass
  • n8n
  • Langflow

📝 Publications & Research

  • Generative Adversarial Network Based Approach towards Synthetically Generating Insider Threat Scenarios
  • Brain Abnormality Categorization using MRI Scans
  • Carbon emission forecasting across different energy sectors using ARIMA
  • Hydroponics using IoT and Machine Learning

🏆 Certifications

  • Azure-certified: AI Fundamentals and DP Fundamentals
  • Google: Elements of AI certification, Google Cloud Engineering and Data Science track
  • Deep Learning Fundamentals - NVIDIA
  • SAS Advanced Visual Analytics and Advanced Statistical Analysis

🌐 Languages

  • English - Professional
  • Hindi - Professional
  • Assamese - Professional

🎯 Extra-Curricular

  • Nationally ranked tennis player
  • Squash player
  • Core member of Bennett AI Society
  • Bennett University Sports Committee member

📫 Contact Me

Pinned

  1. n8n-MCP-Agent (Public)

    n8n workflow demonstrating an AI agent using Model Context Protocol (MCP) to dynamically discover and execute external tools from Brave Search and Convex with persistent PostgreSQL chat memory.

    2

  2. Web-knowledge-Crawler (Public)

    A distributed Model Context Protocol (MCP) server implementing an intelligent web ingestion and retrieval-augmented generation pipeline for AI agents and coding assistants.

    Python 8 5

  3. Parsflow (Public)

    A comprehensive all-in-one document parsing system with intelligent image descriptors, MCP support, REST endpoints, and an agentic framework with user-defined custom parsing pipelines.

    Python

  4. CardinalQuery (Public)

    An autonomous LLM-powered SQL agent leveraging multi-agent orchestration, vector embeddings, and dynamic schema pruning to enable natural language querying over high-cardinality databases exceeding…

    Python

  5. Clinical-Extract (Public)

    Clinical trial document processing platform with AI-powered parsing, structured data extraction via LangExtract, real-time analytics dashboards, Supabase integration, and autonomous LangGraph workf…

    Python 1

  6. Data-Anonymization-System (Public)

    A solution designed to detect and anonymize Personally Identifiable Information (PII) in various types of data. Built with Microsoft Presidio as its core engine, the system provides a robust, scala…

    Python