AstroCoder: AI Code Generator (Java to Python and Vice Versa)

This repository contains the source code for AstroCoder, an AI-based Code Generator that automates code generation, optimization, and bug fixing for Java and Python. AstroCoder is built using a hybrid RLHF (Reinforcement Learning with Human Feedback) Transformer model, inspired by RNN (Recurrent Neural Networks) and LSTM (Long Short-Term Memory). The AI system is designed to understand code context, similar to next-word prediction models, and enables efficient cross-language code migration with 85% precision in resolving migration errors.

AstroCoder can generate, improve, and migrate code across multiple languages, significantly reducing development time in large-scale software projects. The system has a user-friendly interface that enhances developer efficiency by 40%, making it a valuable tool for automating software engineering tasks.

Features

1. Java to Python Code Conversion

Seamless conversion of Java code into Python and vice versa.
Handles syntax adaptation, logical flow adjustments, and conversion of core programming constructs.
Uses AI-driven optimization to apply best coding practices across languages.

2. Automated Code Generation from Natural Language

Converts natural language queries into executable code (e.g., "Generate a sorting algorithm in Java").
Supports functions, classes, algorithms, and complete scripts.
Understands query context to generate relevant, structured code.

3. Hybrid RLHF Code Fixing and Optimization

Detects and resolves syntax errors, logical issues, and inefficiencies in code.
Provides AI-based suggestions to improve readability, performance, and maintainability.
Automates cross-language migration with an 85% accuracy rate in fixing errors.

4. Transformer-based Learning with NLP

The AI model is built using a Transformer-based hybrid RLHF approach, improving contextual understanding.
Inspired by RNN and LSTM, adapted to handle structured programming languages.
Trained on diverse datasets of Java and Python to learn best practices and common patterns.

5. Customizable and Scalable for Large Codebases

Adaptable for enterprise-level applications with large codebases.
Can be fine-tuned for additional languages, supporting diverse software development environments.
Designed for collaborative development, making it ideal for team-based projects.

Technologies Used

Deep Learning & AI: Transformer-based hybrid RLHF model, inspired by RNN and LSTM.
Natural Language Processing (NLP): Enables contextual understanding of queries.
TensorFlow/PyTorch: Used for training and deploying the AI model.
Flask/Django: Backend API for serving the AI-based code generation system.
Java & Python: Core languages for code conversion and optimization.
Git/GitHub: Version control with CI/CD pipelines for automated testing and deployment.

Demo

View the live demo of AstroCoder on GitHub:

Demo Video

This demo showcases AstroCoder's ability to generate, optimize, and fix code in real-time, demonstrating its cross-language capabilities and AI-driven automation.

Getting Started

Prerequisites

Python 3.x
Java Development Kit (JDK)
TensorFlow / PyTorch
Flask / Django for API deployment

CodeT5-Based Java-to-Python Translation

🔹 Preprocessing Enhancements

1. Custom Tokenization

Prefixed Java code with "Translate Java to Python: " to improve task recognition.

2. Padding Handling

Replaced padding tokens with -100 to prevent loss function bias.

3. Dataset Sampling & Splitting

Used 40% sampled training data.
Applied an 80-20 train-test split for efficiency.

4. Tokenizer Optimization

Used RobertaTokenizer instead of the default T5 tokenizer for better code processing.

5. Efficient Data Processing

Converted datasets to TensorFlow format.
Used optimized batching & prefetching with tf.data.AUTOTUNE.

6. Multi-GPU & XLA Support

Enabled XLA optimization.
Used MirroredStrategy for distributed training.
Disabled auto-sharding for better resource allocation.

7. Custom Logging & Progress Bar

Implemented ETA tracking.
Adjusted learning rate dynamically per step.

🔹 Fine-Tuning Enhancements

Hyperparameter Updates:

Learning Rate: 3e-4 (faster convergence)
Weight Decay: 1e-4 (regularization)
Warmup Ratio: 0.2 (gradual learning rate increase)
Batch Size: 6 (optimized for GPU efficiency)
Max Sequence Length: 300 (handles longer code snippets)
Epochs: 8 (better convergence)

Model Selection:

Used TFT5ForConditionalGeneration (CodeT5) for enhanced Java-to-Python translation.

Custom Learning Rate Scheduler:

Applied warmup and gradient scaling for stable training.

Multi-GPU & Performance Boost:

Implemented XLA acceleration.
Utilized MirroredStrategy.
Optimized data pipeline with auto-tuning.

Additional Project: AI Healthcare Chatbot

In addition to AstroCoder, this project also includes an AI-powered Healthcare Chatbot. Using NLP and Transformer-based models, the chatbot can diagnose potential diseases based on user symptoms, providing insights and medical guidance. The chatbot leverages LLM (Large Language Models) and LangChain to deliver intelligent responses for medical queries.

This repository brings together advanced AI-driven code generation and healthcare diagnostics, showcasing the power of hybrid RLHF Transformers in real-world applications.

Name		Name	Last commit message	Last commit date
Latest commit History 20 Commits
Hackathon		Hackathon
Model_		Model_
Upload		Upload
Gen AI Abstract.pdf		Gen AI Abstract.pdf
LICENSE		LICENSE
README.md		README.md
app.py		app.py
requirements.txt		requirements.txt
users.json		users.json

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

AstroCoder: AI Code Generator (Java to Python and Vice Versa)

Features

1. Java to Python Code Conversion

2. Automated Code Generation from Natural Language

3. Hybrid RLHF Code Fixing and Optimization

4. Transformer-based Learning with NLP

5. Customizable and Scalable for Large Codebases

Technologies Used

Demo

Getting Started

Prerequisites

CodeT5-Based Java-to-Python Translation

🔹 Preprocessing Enhancements

1. Custom Tokenization

2. Padding Handling

3. Dataset Sampling & Splitting

4. Tokenizer Optimization

5. Efficient Data Processing

6. Multi-GPU & XLA Support

7. Custom Logging & Progress Bar

🔹 Fine-Tuning Enhancements

Hyperparameter Updates:

Model Selection:

Custom Learning Rate Scheduler:

Multi-GPU & Performance Boost:

Additional Project: AI Healthcare Chatbot

About

Uh oh!

Releases

Packages

Uh oh!

Languages

License

GunaShankar0213/AI-based-Code-Generator-Java-Python-

Folders and files

Latest commit

History

Repository files navigation

AstroCoder: AI Code Generator (Java to Python and Vice Versa)

Features

1. Java to Python Code Conversion

2. Automated Code Generation from Natural Language

3. Hybrid RLHF Code Fixing and Optimization

4. Transformer-based Learning with NLP

5. Customizable and Scalable for Large Codebases

Technologies Used

Demo

Getting Started

Prerequisites

CodeT5-Based Java-to-Python Translation

🔹 Preprocessing Enhancements

1. Custom Tokenization

2. Padding Handling

3. Dataset Sampling & Splitting

4. Tokenizer Optimization

5. Efficient Data Processing

6. Multi-GPU & XLA Support

7. Custom Logging & Progress Bar

🔹 Fine-Tuning Enhancements

Hyperparameter Updates:

Model Selection:

Custom Learning Rate Scheduler:

Multi-GPU & Performance Boost:

Additional Project: AI Healthcare Chatbot

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Languages

Packages