# Preparing documents

In [1]:
# @title Proposal `proposal`

proposal = """
# Project Proposal: Intelligent Data Integration Framework

## Project Name
**Intelligent Data Integration Framework**

## Project Summary
The Intelligent Data Integration Framework aims to revolutionize data integration processes by deploying an AI-driven solution that automates schema matching, data transformation, and data cleansing. This initiative is designed to minimize manual data handling efforts and enhance data quality across disparate data sources.

## Business Case / Problem Statement
In today's data-driven environment, organizations struggle with integrating data from various sources, each with unique schemas and formats. This process is typically labor-intensive, prone to errors, and requires significant human resources. The current manual integration methodologies are inefficient and cannot meet the dynamic demands of businesses seeking real-time data insights. Hence, there is a compelling need for an intelligent framework that simplifies and automates these tasks to enhance operational efficiency and data quality.

## Expected Outcomes
- **Operational Improvements**: Reduction in manual data integration efforts by over 50%, leading to more efficient utilization of human resources.
- **Technical Improvements**: Enhanced data quality through automated detection and correction of anomalies, missing values, and duplicates, resulting in reliable datasets for analysis.
- **Monetary Value**: Projected cost savings of approximately $500,000 annually due to reduced manual processing and error correction efforts.
- **Efficiency Gains**: The introduction of a drag-and-drop interface for workflow management and real-time analytics is expected to decrease the time required for integration tasks by 60%.

## Monetary Value
The monetary impact includes substantial cost savings, estimated at $500,000 annually, achieved through the reduction of manual data processing tasks and error correction. Additionally, improved data accuracy will drive better decision-making and potentially increase revenue by leveraging high-quality data for strategic initiatives.

## Time Value
The framework is anticipated to deliver significant time savings by automating data integration processes, reducing the time spent on these tasks by approximately 60%. This efficiency gain translates into quicker data availability for analysis and decision-making, ultimately accelerating business processes and responses to market changes.

## Project Sponsor / Owner
The project is sponsored and overseen by the Team Leadership, with Stephanie Harris serving as the Director and Justin Lee as the Vice-Director. Their strategic vision and leadership will guide the project to successful implementation and operation.

## Key Stakeholders
The key stakeholders encompass the Business Intelligence team, which plays a crucial role in the project's execution. The team includes:
- **Timothy Johnson**: Business Intelligence Lead
- **Cynthia Harris**: BI Analyst
- **Kathleen Stewart**: BI Analyst
- **Robert Stewart**: BI Analyst
- **Sarah Jones**: BI Analyst
- **Susan Moore**: BI Analyst
- **Linda Parker**: BI Analyst
- **Emily Collins**: BI Analyst
- **Jennifer Rogers**: BI Analyst
- **Rebecca Richardson**: BI Analyst

Each member contributes unique expertise to ensure the framework meets the organization's data integration needs effectively. Their involvement is critical in the design, implementation, and evaluation phases, ensuring the solution aligns with business requirements and goals.
"""

In [2]:
# @title Requirements `requirements`
requirements = """
# Requirements Document for Intelligent Data Integration Framework

## Project Name
**Intelligent Data Integration Framework**

## Executive Summary
The Intelligent Data Integration Framework aims to automate and enhance the data integration process by leveraging artificial intelligence. This framework is designed to manage data from various sources with distinct schemas, ensuring efficient data transformation and cleansing. The use of AI in schema matching, anomaly detection, and data correction will significantly reduce manual intervention, improve data quality, and streamline ETL processes.

## Tasks and Task Dependencies

### Task 1: Schema Matching Automation
- **Description**: Develop AI models capable of automatically matching schemas from disparate data sources.
- **Dependencies**: Requires initial data source analysis.
- **Assigned Employees**: Cynthia Harris, Robert Stewart
- **Estimated Duration**: 4 weeks
- **Required Skills**: AI model training, schema analysis, Python programming.

### Task 2: Data Transformation and ETL Automation
- **Description**: Create automated data transformation rules and streamline ETL processes using AI-generated suggestions.
- **Dependencies**: Successful completion of schema matching models.
- **Assigned Employees**: Kathleen Stewart, Sarah Jones
- **Estimated Duration**: 6 weeks
- **Required Skills**: ETL process design, AI integration, data transformation techniques.

### Task 3: Data Cleansing Module Development
- **Description**: Develop a module to detect and correct anomalies, missing values, and duplicates.
- **Dependencies**: Integration with transformation rules.
- **Assigned Employees**: Susan Moore, Linda Parker
- **Estimated Duration**: 5 weeks
- **Required Skills**: Data cleansing techniques, anomaly detection, AI integration.

### Task 4: User Interface Design and Implementation
- **Description**: Design a drag-and-drop interface for workflow management and real-time analytics.
- **Dependencies**: Development of core framework functionalities.
- **Assigned Employees**: Emily Collins, Jennifer Rogers
- **Estimated Duration**: 4 weeks
- **Required Skills**: UI/UX design, JavaScript, real-time data visualization.

### Task 5: Cloud Integration and Deployment
- **Description**: Ensure the framework supports integration with cloud providers and on-premises systems.
- **Dependencies**: Completion of core framework development.
- **Assigned Employees**: Timothy Johnson, Rebecca Richardson
- **Estimated Duration**: 3 weeks
- **Required Skills**: Cloud computing, deployment strategies, API integration.

### Task 6: Continuous Learning and Feedback Loop Implementation
- **Description**: Develop a feedback loop to improve AI models through user corrections and preferences.
- **Dependencies**: Operational user interface and core functionalities.
- **Assigned Employees**: Entire team collaboration
- **Estimated Duration**: 2 weeks
- **Required Skills**: Machine learning, user feedback analysis, iterative model improvement.

## Business Rules / Constraints

- **Data Security**: Ensure compliance with data protection regulations, including GDPR and CCPA, by implementing robust security measures and encryption protocols.
- **Cloud Agnosticism**: Framework must remain cloud-agnostic, supporting seamless integration with AWS, Azure, Google Cloud, and on-premises systems.
- **Scalability**: The framework must be scalable to accommodate varying volumes of data and increasing integration demands.
- **User Accessibility**: Interface must be user-friendly, catering to users with varying levels of technical expertise, emphasizing ease of use and accessibility.

## Conclusion
This document outlines a comprehensive plan to develop the Intelligent Data Integration Framework. Each task is strategically arranged to ensure a smooth workflow, with dependencies clearly identified to facilitate efficient project management. With the team’s expertise and the outlined tasks, we are poised to achieve the project’s objectives, delivering significant operational and technical improvements while adhering to stringent business rules and constraints.
"""

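The task entries in `requirements` follow a regular layout: a `### Task N: Title` heading followed by `- **Field**: value` bullets. That makes them easy to pull into dictionaries for later processing. The block below is a minimal sketch and not part of the original notebook: `parse_tasks` and `sample` are hypothetical names, and `sample` is a shortened excerpt in the same layout so the block runs on its own.

```python
import re

# Shortened excerpt in the same layout as the `requirements` string (illustrative only).
sample = """
## Tasks and Task Dependencies

### Task 1: Schema Matching Automation
- **Dependencies**: Requires initial data source analysis.
- **Estimated Duration**: 4 weeks

### Task 2: Data Transformation and ETL Automation
- **Dependencies**: Successful completion of schema matching models.
- **Estimated Duration**: 6 weeks

## Business Rules / Constraints

- **Data Security**: Ensure compliance with data protection regulations.
"""

HEADING = re.compile(r"^### Task (\d+): (.+)$")
FIELD = re.compile(r"^- \*\*(.+?)\*\*: (.+)$")


def parse_tasks(markdown_text):
    """Return one dict per '### Task N: Title' block, with its bullet fields."""
    tasks = []
    current = None
    for raw_line in markdown_text.splitlines():
        line = raw_line.strip()
        heading = HEADING.match(line)
        if heading:
            # Start a new task record.
            current = {"id": int(heading.group(1)), "title": heading.group(2)}
            tasks.append(current)
        elif line.startswith("## "):
            # A new second-level section ends the current task.
            current = None
        elif current is not None:
            field = FIELD.match(line)
            if field:
                current[field.group(1)] = field.group(2)
    return tasks


parsed = parse_tasks(sample)
for task in parsed:
    print(task["id"], task["title"], task["Estimated Duration"])
```

In the notebook itself, `parse_tasks(requirements)` should return one dictionary per task. The reset on `## ` headings keeps the business-rule bullets from being attached to the last task.
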
In [3]:
# @title Team Allocation `team_alloc`
team_alloc = """
This document describes the structure and responsibilities of the project team, clarifying roles and workload distribution.

## Team Structure and Role Assignments

### Project Oversight and Leadership

The **Intelligent Data Integration Framework** is overseen by the **Team Leadership**, which provides strategic direction and ensures alignment with the organization's objectives. The leadership team consists of:

- **Stephanie Harris** (Director): As the project director, Stephanie provides overarching guidance and strategic oversight, ensuring that project objectives align with corporate goals.
- **Justin Lee** (Vice-Director): Justin supports the director by managing operational aspects and ensuring the project adheres to timelines and budget constraints.

### Business Intelligence Team

The execution of the project is primarily managed by the **Business Intelligence Team**, which is responsible for the analytical and technical components of the framework. The team composition and respective roles are as follows:

- **Timothy Johnson** (Business Intelligence Lead, Senior): Timothy leads the BI team, coordinating efforts across various tasks and ensuring that the integration framework meets technical standards and business needs.
- **Cynthia Harris** (BI Analyst, Mid-Level): Cynthia focuses on schema matching automation, leveraging her expertise in AI model training and data analysis.
- **Kathleen Stewart** (BI Analyst, Junior): Kathleen is tasked with data transformation and ETL automation, contributing fresh insights and innovative approaches.
- **Robert Stewart** (BI Analyst, Mid-Level): Robert collaborates on schema matching automation, applying his analytical skills to enhance model accuracy and efficiency.
- **Sarah Jones** (BI Analyst, Mid-Level): Sarah assists in data transformation processes, ensuring ETL tasks are streamlined and effective.
- **Susan Moore** (BI Analyst, Senior): Susan leads the development of the data cleansing module, utilizing her experience in anomaly detection and correction techniques.
- **Linda Parker** (BI Analyst, Mid-Level): Linda supports data cleansing efforts, focusing on ensuring data integrity and quality.
- **Emily Collins** (BI Analyst, Junior): Emily is responsible for the design and implementation of the user interface, integrating user-friendly features for workflow management.
- **Jennifer Rogers** (BI Analyst, Mid-Level): Jennifer collaborates on UI design, enhancing real-time analytics and data visualization capabilities.
- **Rebecca Richardson** (BI Analyst, Mid-Level): Rebecca manages cloud integration and deployment, ensuring the framework's compatibility with various cloud providers.

## Department and Team Assignments

The project is assigned to the **Business Intelligence Department**, which supervises the overall execution and ensures that the project aligns with data strategy initiatives. The department's cohesive structure fosters an environment of collaboration, innovation, and continuous improvement.

## Workload Distribution

The workload is distributed based on seniority and expertise, ensuring that each team member's skills are optimally utilized. The leadership team commits approximately 20% of their time to strategic oversight. Senior team members, such as Timothy and Susan, allocate up to 50% of their capacity to lead critical tasks, while mid-level analysts, such as Cynthia and Robert, dedicate around 40% to their specialized areas. Junior analysts, including Kathleen and Emily, contribute approximately 30% of their efforts, focusing on learning and support activities.

This structured allocation guarantees that the project progresses systematically while maintaining a balance between innovation and operational efficiency. The team's diverse expertise and disciplined workload distribution are pivotal in realizing the ambitious goals set for the Intelligent Data Integration Framework.
"""

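The percentages in the workload paragraph of `team_alloc` can be turned into a rough capacity figure. The block below is a sketch under stated assumptions: seniority labels are copied from the team structure list, "up to 50%" is treated as exactly 50% (so the result is an upper bound), and the names `allocation_pct`, `team`, and `total_pct` are hypothetical.

```python
# Allocation percentages quoted in the workload paragraph (integers avoid float rounding).
allocation_pct = {"Leadership": 20, "Senior": 50, "Mid-Level": 40, "Junior": 30}

# Seniority labels as listed in the team structure section of `team_alloc`.
team = {
    "Stephanie Harris": "Leadership",
    "Justin Lee": "Leadership",
    "Timothy Johnson": "Senior",
    "Susan Moore": "Senior",
    "Cynthia Harris": "Mid-Level",
    "Robert Stewart": "Mid-Level",
    "Sarah Jones": "Mid-Level",
    "Linda Parker": "Mid-Level",
    "Jennifer Rogers": "Mid-Level",
    "Rebecca Richardson": "Mid-Level",
    "Kathleen Stewart": "Junior",
    "Emily Collins": "Junior",
}


def total_pct(levels):
    """Sum the allocation percentage of every person whose level is in `levels`."""
    return sum(allocation_pct[level] for level in team.values() if level in levels)


bi_team_pct = total_pct({"Senior", "Mid-Level", "Junior"})
leadership_pct = total_pct({"Leadership"})

# Dividing by 100 expresses the totals as full-time equivalents.
print("BI team FTE (upper bound):", bi_team_pct / 100)
print("Leadership FTE:", leadership_pct / 100)
```

Under these assumptions, the ten BI team members together contribute at most the equivalent of 4.0 full-time people, with a further 0.4 from leadership.
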
In [4]:
# @title Roadmap `roadmap`
roadmap = """
# Project Roadmap / Timeline: Intelligent Data Integration Framework

## Introduction
The development of the Intelligent Data Integration Framework is structured around a detailed roadmap, emphasizing pivotal milestones, task dependencies, and a comparison of projected versus actual completion times. This document provides a comprehensive timeline for the project, enabling effective tracking and management of progress.

## Major Milestones

1. **Initial Data Source Analysis**
   - **Completion Date**: Week 2
   - **Objective**: Conduct thorough analysis of existing data sources to understand schema structures and integration requirements.
   - **Lead**: Cynthia Harris and Robert Stewart

2. **Schema Matching Automation Development**
   - **Completion Date**: Week 6
   - **Objective**: Develop and test AI models for schema matching.
   - **Dependencies**: Completion of data source analysis.
   - **Lead**: Cynthia Harris and Robert Stewart

3. **Data Transformation and ETL Automation**
   - **Completion Date**: Week 12
   - **Objective**: Implement automated data transformation rules and streamline ETL processes.
   - **Dependencies**: Successful implementation of schema matching automation.
   - **Lead**: Kathleen Stewart and Sarah Jones

4. **Data Cleansing Module Development**
   - **Completion Date**: Week 17
   - **Objective**: Create a module to detect and correct data anomalies.
   - **Dependencies**: Integration of transformation rules.
   - **Lead**: Susan Moore and Linda Parker

5. **User Interface Design and Implementation**
   - **Completion Date**: Week 21
   - **Objective**: Develop a user-friendly drag-and-drop interface for workflow management.
   - **Dependencies**: Completion of core framework functionalities.
   - **Lead**: Emily Collins and Jennifer Rogers

6. **Cloud Integration and Deployment**
   - **Completion Date**: Week 24
   - **Objective**: Ensure compatibility with cloud providers and on-premises systems.
   - **Dependencies**: Finalization of core framework and UI design.
   - **Lead**: Timothy Johnson and Rebecca Richardson

7. **Continuous Learning and Feedback Loop Implementation**
   - **Completion Date**: Week 26
   - **Objective**: Implement feedback loop for AI model improvement.
   - **Dependencies**: Operational UI and core functionalities.
   - **Lead**: Entire team collaboration

## Dependencies Between Tasks
The roadmap is meticulously structured to ensure a logical flow of tasks. Initial data source analysis lays the groundwork for schema matching automation. The successful development of schema matching is crucial for the subsequent automation of data transformation and ETL processes. The functionality of the data cleansing module is contingent upon the integration of transformation rules. The design and implementation of the user interface are dependent on the completion of core framework functionalities. Finally, cloud integration cannot proceed until the core framework and UI design are finalized.

## Projected vs. Actual Completion Times
Tracking projected versus actual completion times is vital for identifying potential delays and initiating corrective actions. Each milestone will be monitored weekly, with detailed reports generated to assess progress. Adjustments to timelines will be made as required to maintain project momentum and ensure timely delivery.

## Story Points / Effort Estimations
Effort estimations are quantified using story points, a measure of workload that accounts for complexity and time investment:

- **Schema Matching Automation**: 50 story points
- **Data Transformation and ETL Automation**: 70 story points
- **Data Cleansing Module Development**: 60 story points
- **User Interface Design and Implementation**: 40 story points
- **Cloud Integration and Deployment**: 30 story points
- **Continuous Learning and Feedback Loop**: 20 story points

These estimations guide resource allocation and task prioritization, ensuring a balanced distribution of efforts across the team.

## Conclusion
The roadmap for the Intelligent Data Integration Framework is designed to facilitate efficient project execution, with clear milestones, dependencies, and effort estimations. Continuous monitoring and adjustment will ensure that the project adheres to its strategic objectives, delivering a robust, intelligent framework that transforms data integration processes.
"""

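The milestone weeks in `roadmap` can be cross-checked against the estimated durations in `requirements`. The block below is a sketch that assumes the six tasks run strictly one after another, starting once the initial data source analysis finishes in Week 2. The documents imply that sequence but do not state it outright, and the variable names are hypothetical.

```python
from itertools import accumulate

# Week in which the initial data source analysis completes (from the roadmap).
analysis_end_week = 2

# Estimated durations in weeks, copied from the requirements document.
durations = {
    "Schema Matching Automation": 4,
    "Data Transformation and ETL Automation": 6,
    "Data Cleansing Module Development": 5,
    "User Interface Design and Implementation": 4,
    "Cloud Integration and Deployment": 3,
    "Continuous Learning and Feedback Loop": 2,
}

# Completion weeks stated in the roadmap.
roadmap_weeks = {
    "Schema Matching Automation": 6,
    "Data Transformation and ETL Automation": 12,
    "Data Cleansing Module Development": 17,
    "User Interface Design and Implementation": 21,
    "Cloud Integration and Deployment": 24,
    "Continuous Learning and Feedback Loop": 26,
}

# Running total of durations, offset by the analysis phase (assumes no overlap).
running = list(accumulate(durations.values(), initial=analysis_end_week))[1:]
computed_weeks = dict(zip(durations, running))

# Collect any milestone where the two documents disagree.
mismatches = {
    name: (computed_weeks[name], roadmap_weeks[name])
    for name in durations
    if computed_weeks[name] != roadmap_weeks[name]
}
print(mismatches or "Roadmap weeks match the cumulative durations.")
```

With the figures above, the cumulative durations reproduce every roadmap week, so the two documents agree under the sequential assumption.
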
In [None]:
# @title JIRA tasks `jira_tasks`
jira_tasks = """
# JIRA Assignment Document: Intelligent Data Integration Framework

## Task 1: Schema Matching Algorithm Development
- **Issue Type**: Epic
- **Issue Priority**: Critical
- **Assigned Employees**: Cynthia Harris, Robert Stewart
- **Advisor Employees**: Timothy Johnson
- **Issue Status**: In Progress
- **Story Points**: 50
- **Required Tasks**: None
- **Issue Summary/Description**: Develop AI algorithms for automatic schema matching.
- **Comments / Notes**: Initial model prototypes show promising accuracy.

---

## Task 2: Data Transformation Rule Generation
- **Issue Type**: Story
- **Issue Priority**: Major
- **Assigned Employees**: Kathleen Stewart, Sarah Jones
- **Advisor Employees**: Susan Moore
- **Issue Status**: Open
- **Story Points**: 40
- **Required Tasks**: Task 1 - Schema Matching Algorithm Development
- **Issue Summary/Description**: Generate rules for data transformation using AI.
- **Comments / Notes**: Awaiting schema matching results.

---

## Task 3: ETL Process Automation
- **Issue Type**: Story
- **Issue Priority**: Major
- **Assigned Employees**: Kathleen Stewart, Sarah Jones
- **Advisor Employees**: Linda Parker
- **Issue Status**: Open
- **Story Points**: 70
- **Required Tasks**: Task 2 - Data Transformation Rule Generation
- **Issue Summary/Description**: Automate ETL processes using AI-generated rules.
- **Comments / Notes**: Initial outline completed.

---

## Task 4: Data Cleansing Module Implementation
- **Issue Type**: Story
- **Issue Priority**: Major
- **Assigned Employees**: Susan Moore, Linda Parker
- **Advisor Employees**: Timothy Johnson
- **Issue Status**: Open
- **Story Points**: 60
- **Required Tasks**: Task 3 - ETL Process Automation
- **Issue Summary/Description**: Implement module for anomaly detection and correction.
- **Comments / Notes**: Research phase complete.

---

## Task 5: User Interface Design
- **Issue Type**: Story
- **Issue Priority**: Major
- **Assigned Employees**: Emily Collins, Jennifer Rogers
- **Advisor Employees**: Rebecca Richardson
- **Issue Status**: Open
- **Story Points**: 40
- **Required Tasks**: Task 4 - Data Cleansing Module Implementation
- **Issue Summary/Description**: Design drag-and-drop UI for workflow management.
- **Comments / Notes**: Mockups being finalized.

---

## Task 6: Cloud Integration Strategy
- **Issue Type**: Task
- **Issue Priority**: Major
- **Assigned Employees**: Timothy Johnson, Rebecca Richardson
- **Advisor Employees**: Stephanie Harris
- **Issue Status**: Open
- **Story Points**: 30
- **Required Tasks**: Task 5 - User Interface Design
- **Issue Summary/Description**: Develop strategy for cloud-agnostic integration.
- **Comments / Notes**: Initial tests with AWS and Azure successful.

---

## Task 7: AI Model Training & Optimization
- **Issue Type**: Epic
- **Issue Priority**: Critical
- **Assigned Employees**: Cynthia Harris, Robert Stewart
- **Advisor Employees**: Justin Lee
- **Issue Status**: In Progress
- **Story Points**: 80
- **Required Tasks**: Task 1 - Schema Matching Algorithm Development
- **Issue Summary/Description**: Train and optimize AI models for schema recognition.
- **Comments / Notes**: Data set preparation underway.

---

## Task 8: Feedback Loop Development
- **Issue Type**: Story
- **Issue Priority**: Major
- **Assigned Employees**: Entire Team
- **Advisor Employees**: Stephanie Harris
- **Issue Status**: Open
- **Story Points**: 50
- **Required Tasks**: Task 7 - AI Model Training & Optimization
- **Issue Summary/Description**: Implement feedback loop for AI model improvement.
- **Comments / Notes**: Planning phase initiated.

---

## Task 9: Data Quality Assessment Tools
- **Issue Type**: Task
- **Issue Priority**: Major
- **Assigned Employees**: Susan Moore, Linda Parker
- **Advisor Employees**: Timothy Johnson
- **Issue Status**: Open
- **Story Points**: 40
- **Required Tasks**: Task 4 - Data Cleansing Module Implementation
- **Issue Summary/Description**: Develop tools for assessing data quality.
- **Comments / Notes**: Requirements gathering in progress.

---

## Task 10: Workflow Automation Scripts
- **Issue Type**: Task
- **Issue Priority**: Major
- **Assigned Employees**: Kathleen Stewart, Sarah Jones
- **Advisor Employees**: Jennifer Rogers
- **Issue Status**: Open
- **Story Points**: 45
- **Required Tasks**: Task 3 - ETL Process Automation
- **Issue Summary/Description**: Create scripts for automated workflow management.
- **Comments / Notes**: Script outline drafted.

---

## Task 11: Real-Time Analytics Dashboard
- **Issue Type**: Story
- **Issue Priority**: Major
- **Assigned Employees**: Emily Collins, Jennifer Rogers
- **Advisor Employees**: Rebecca Richardson
- **Issue Status**: Open
- **Story Points**: 50
- **Required Tasks**: Task 5 - User Interface Design
- **Issue Summary/Description**: Develop dashboard for real-time data analytics.
- **Comments / Notes**: Dashboard design in conceptual stage.

---

## Task 12: Security Protocol Implementation
- **Issue Type**: Task
- **Issue Priority**: Critical
- **Assigned Employees**: Timothy Johnson, Rebecca Richardson
- **Advisor Employees**: Justin Lee
- **Issue Status**: Open
- **Story Points**: 60
- **Required Tasks**: Task 6 - Cloud Integration Strategy
- **Issue Summary/Description**: Implement security protocols for data protection.
- **Comments / Notes**: Compliance with GDPR and CCPA required.

---

## Task 13: API Development for Integration
- **Issue Type**: Task
- **Issue Priority**: Major
- **Assigned Employees**: Cynthia Harris, Robert Stewart
- **Advisor Employees**: Susan Moore
- **Issue Status**: Open
- **Story Points**: 55
- **Required Tasks**: Task 7 - AI Model Training & Optimization
- **Issue Summary/Description**: Develop APIs for seamless integration with external systems.
- **Comments / Notes**: API design in progress.

---

## Task 14: Load Testing and Performance Optimization
- **Issue Type**: Task
- **Issue Priority**: Major
- **Assigned Employees**: Susan Moore, Linda Parker
- **Advisor Employees**: Kathleen Stewart
- **Issue Status**: Open
- **Story Points**: 30
- **Required Tasks**: Task 9 - Data Quality Assessment Tools
- **Issue Summary/Description**: Perform load testing and optimize performance.
- **Comments / Notes**: Load testing tools identified.

---

## Task 15: Documentation and User Manuals
- **Issue Type**: Task
- **Issue Priority**: Major
- **Assigned Employees**: Emily Collins, Jennifer Rogers
- **Advisor Employees**: Sarah Jones
- **Issue Status**: Open
- **Story Points**: 25
- **Required Tasks**: Task 11 - Real-Time Analytics Dashboard
- **Issue Summary/Description**: Create comprehensive documentation and user manuals.
- **Comments / Notes**: Outline for documentation created.

---

## Task 16: Anomaly Detection Algorithm Enhancement
- **Issue Type**: Story
- **Issue Priority**: Major
- **Assigned Employees**: Susan Moore, Linda Parker
- **Advisor Employees**: Cynthia Harris
- **Issue Status**: Open
- **Story Points**: 45
- **Required Tasks**: Task 9 - Data Quality Assessment Tools
- **Issue Summary/Description**: Enhance algorithms for better anomaly detection.
- **Comments / Notes**: Algorithm review in progress.

---

## Task 17: Continuous Integration Setup
- **Issue Type**: Task
- **Issue Priority**: Major
- **Assigned Employees**: Timothy Johnson, Rebecca Richardson
- **Advisor Employees**: Robert Stewart
- **Issue Status**: Open
- **Story Points**: 35
- **Required Tasks**: Task 12 - Security Protocol Implementation
- **Issue Summary/Description**: Set up continuous integration environment for development.
- **Comments / Notes**: Tools selection phase complete.

---

## Task 18: User Feedback Collection Mechanism
- **Issue Type**: Task
- **Issue Priority**: Major
- **Assigned Employees**: Entire Team
- **Advisor Employees**: Stephanie Harris
- **Issue Status**: Open
- **Story Points**: 30
- **Required Tasks**: Task 8 - Feedback Loop Development
- **Issue Summary/Description**: Develop mechanism for collecting user feedback.
- **Comments / Notes**: Feedback form draft completed.

---

## Task 19: Training and Support Plan
- **Issue Type**: Task
- **Issue Priority**: Major
- **Assigned Employees**: Emily Collins, Jennifer Rogers
- **Advisor Employees**: Linda Parker
- **Issue Status**: Open
- **Story Points**: 25
- **Required Tasks**: Task 15 - Documentation and User Manuals
- **Issue Summary/Description**: Develop training and support plan for users.
- **Comments / Notes**: Training schedule being developed.

---

## Task 20: Data Pipeline Optimization
- **Issue Type**: Task
- **Issue Priority**: Major
- **Assigned Employees**: Kathleen Stewart, Sarah Jones
- **Advisor Employees**: Susan Moore
- **Issue Status**: Open
- **Story Points**: 40
- **Required Tasks**: Task 10 - Workflow Automation Scripts
- **Issue Summary/Description**: Optimize data pipeline for efficiency.
- **Comments / Notes**: Identifying bottlenecks in current pipeline.

---

## Task 21: Integration with Machine Learning Tools
- **Issue Type**: Task
- **Issue Priority**: Major
- **Assigned Employees**: Cynthia Harris, Robert Stewart
- **Advisor Employees**: Jennifer Rogers
- **Issue Status**: Open
- **Story Points**: 50
- **Required Tasks**: Task 7 - AI Model Training & Optimization
- **Issue Summary/Description**: Integrate with existing ML tools for enhanced capabilities.
- **Comments / Notes**: Compatibility assessment underway.

---

## Task 22: Cross-Platform Compatibility Testing
- **Issue Type**: Task
- **Issue Priority**: Major
- **Assigned Employees**: Susan Moore, Linda Parker
- **Advisor Employees**: Timothy Johnson
- **Issue Status**: Open
- **Story Points**: 30
- **Required Tasks**: Task 6 - Cloud Integration Strategy
- **Issue Summary/Description**: Ensure compatibility across different platforms.
- **Comments / Notes**: Testing environments set up.

---

## Task 23: Real-Time Data Monitoring Tools
- **Issue Type**: Story
- **Issue Priority**: Major
- **Assigned Employees**: Emily Collins, Jennifer Rogers
- **Advisor Employees**: Rebecca Richardson
- **Issue Status**: Open
- **Story Points**: 40
- **Required Tasks**: Task 11 - Real-Time Analytics Dashboard
- **Issue Summary/Description**: Develop tools for real-time data monitoring.
- **Comments / Notes**: Monitoring requirements identified.

---

## Task 24: Scalability Testing
- **Issue Type**: Task
- **Issue Priority**: Major
- **Assigned Employees**: Timothy Johnson, Rebecca Richardson
- **Advisor Employees**: Justin Lee
- **Issue Status**: Open
- **Story Points**: 35
- **Required Tasks**: Task 14 - Load Testing and Performance Optimization
- **Issue Summary/Description**: Conduct scalability testing for large data volumes.
- **Comments / Notes**: Test cases prepared.

---

## Task 25: Data Security Audit
- **Issue Type**: Task
- **Issue Priority**: Critical
- **Assigned Employees**: Timothy Johnson, Rebecca Richardson
- **Advisor Employees**: Stephanie Harris
- **Issue Status**: Open
- **Story Points**: 60
- **Required Tasks**: Task 12 - Security Protocol Implementation
- **Issue Summary/Description**: Perform comprehensive data security audit.
- **Comments / Notes**: Audit checklist created.

---

## Task 26: User Experience Evaluation
- **Issue Type**: Task
- **Issue Priority**: Major
- **Assigned Employees**: Emily Collins, Jennifer Rogers
- **Advisor Employees**: Sarah Jones
- **Issue Status**: Open
- **Story Points**: 30
- **Required Tasks**: Task 5 - User Interface Design
- **Issue Summary/Description**: Evaluate user experience and gather feedback.
- **Comments / Notes**: User testing sessions scheduled.

---

## Task 27: Cloud Resource Optimization
- **Issue Type**: Task
- **Issue Priority**: Major
- **Assigned Employees**: Timothy Johnson, Rebecca Richardson
- **Advisor Employees**: Robert Stewart
- **Issue Status**: Open
- **Story Points**: 40
- **Required Tasks**: Task 6 - Cloud Integration Strategy
- **Issue Summary/Description**: Optimize resource usage for cloud deployments.
- **Comments / Notes**: Resource monitoring tools selected.

---

## Task 28: Integration Testing
- **Issue Type**: Task
- **Issue Priority**: Major
- **Assigned Employees**: Susan Moore, Linda Parker
- **Advisor Employees**: Cynthia Harris
- **Issue Status**: Open
- **Story Points**: 50
- **Required Tasks**: Task 22 - Cross-Platform Compatibility Testing
- **Issue Summary/Description**: Conduct integration testing across entire framework.
- **Comments / Notes**: Test plan in development.

---

## Task 29: Performance Benchmarking
- **Issue Type**: Task
- **Issue Priority**: Major
- **Assigned Employees**: Cynthia Harris, Robert Stewart
- **Advisor Employees**: Timothy Johnson
- **Issue Status**: Open
- **Story Points**: 35
- **Required Tasks**: Task 14 - Load Testing and Performance Optimization
- **Issue Summary/Description**: Benchmark performance against industry standards.
- **Comments / Notes**: Benchmark metrics defined.

---

## Task 30: Automated Testing Suite Development
- **Issue Type**: Task
- **Issue Priority**: Major
- **Assigned Employees**: Kathleen Stewart, Sarah Jones
- **Advisor Employees**: Jennifer Rogers
- **Issue Status**: Open
- **Story Points**: 45
- **Required Tasks**: Task 17 - Continuous Integration Setup
- **Issue Summary/Description**: Develop automated testing suite for framework.
- **Comments / Notes**: Testing framework selection in progress.

---

## Task 31: Data Governance Policy
- **Issue Type**: Task
- **Issue Priority**: Major
- **Assigned Employees**: Susan Moore, Linda Parker
- **Advisor Employees**: Justin Lee
- **Issue Status**: Open
- **Story Points**: 25
- **Required Tasks**: Task 25 - Data Security Audit
- **Issue Summary/Description**: Establish data governance policies and procedures.
- **Comments / Notes**: Policy draft reviewed by stakeholders.

---

## Task 32: Machine Learning Model Integration
- **Issue Type**: Task
- **Issue Priority**: Major
- **Assigned Employees**: Cynthia Harris, Robert Stewart
- **Advisor Employees**: Rebecca Richardson
- **Issue Status**: Open
- **Story Points**: 55
- **Required Tasks**: Task 21 - Integration with Machine Learning Tools
- **Issue Summary/Description**: Integrate trained ML models into the framework.
- **Comments / Notes**: Model integration guidelines prepared.

---

## Task 33: New Feature Development
- **Issue Type**: Task
- **Issue Priority**: Major
- **Assigned Employees**: Emily Collins, Jennifer Rogers
- **Advisor Employees**: Linda Parker
- **Issue Status**: Open
- **Story Points**: 40
- **Required Tasks**: Task 26 - User Experience Evaluation
- **Issue Summary/Description**: Develop and implement new features based on user feedback.
- **Comments / Notes**: Feature list prioritized.

---

## Task 34: System Recovery Plan
- **Issue Type**: Task
- **Issue Priority**: Major
- **Assigned Employees**: Timothy Johnson, Rebecca Richardson
- **Advisor Employees**: Justin Lee
- **Issue Status**: Open
- **Story Points**: 35
- **Required Tasks**: Task 12 - Security Protocol Implementation
- **Issue Summary/Description**: Develop a comprehensive system recovery plan.
- **Comments / Notes**: Recovery scenarios identified.

---

## Task 35: Data Transformation Rule Optimization
- **Issue Type**: Task
- **Issue Priority**: Major
- **Assigned Employees**: Kathleen Stewart, Sarah Jones
- **Advisor Employees**: Susan Moore
- **Issue Status**: Open
- **Story Points**: 30
- **Required Tasks**: Task 2 - Data Transformation Rule Generation
- **Issue Summary/Description**: Optimize data transformation rules for efficiency.
- **Comments / Notes**: Rule optimization techniques under review.

---

## Task 36: User Training Workshops
- **Issue Type**: Task
- **Issue Priority**: Major
- **Assigned Employees**: Emily Collins, Jennifer Rogers
- **Advisor Employees**: Kathleen Stewart
- **Issue Status**: Open
- **Story Points**: 25
- **Required Tasks**: Task 19 - Training and Support Plan
- **Issue Summary/Description**: Conduct workshops for user training and skill enhancement.
- **Comments / Notes**: Workshop materials prepared.

---

## Task 37: AI Model Validation
- **Issue Type**: Task
- **Issue Priority**: Major
- **Assigned Employees**: Cynthia Harris, Robert Stewart
- **Advisor Employees**: Timothy Johnson
- **Issue Status**: Open
- **Story Points**: 45
- **Required Tasks**: Task 7 - AI Model Training & Optimization
- **Issue Summary/Description**: Validate AI models to ensure accuracy and reliability.
- **Comments / Notes**: Validation criteria established.

---

## Task 38: Data Source Connectivity
- **Issue Type**: Task
- **Issue Priority**: Major
- **Assigned Employees**: Susan Moore, Linda Parker
- **Advisor Employees**: Rebecca Richardson
- **Issue Status**: Open
- **Story Points**: 35
- **Required Tasks**: Task 28 - Integration Testing
- **Issue Summary/Description**: Establish connectivity with multiple data sources.
- **Comments / Notes**: Connectivity protocols defined.

---

## Task 39: Automated Deployment Pipeline
- **Issue Type**: Task
- **Issue Priority**: Major
- **Assigned Employees**: Timothy Johnson, Rebecca Richardson
- **Advisor Employees**: Justin Lee
- **Issue Status**: Open
- **Story Points**: 40
- **Required Tasks**: Task 17 - Continuous Integration Setup
- **Issue Summary/Description**: Set up an automated deployment pipeline for framework updates.
- **Comments / Notes**: Deployment strategy outlined.

---

## Task 40: User Interface Testing
- **Issue Type**: Task
- **Issue Priority**: Major
- **Assigned Employees**: Emily Collins, Jennifer Rogers
- **Advisor Employees**: Sarah Jones
- **Issue Status**: Open
- **Story Points**: 30
- **Required Tasks**: Task 11 - Real-Time Analytics Dashboard
- **Issue Summary/Description**: Conduct thorough testing of the user interface.
- **Comments / Notes**: Test cases being developed.

---

## Task 41: Anomaly Reporting System
- **Issue Type**: Task
- **Issue Priority**: Major
- **Assigned Employees**: Susan Moore, Linda Parker
- **Advisor Employees**: Cynthia Harris
- **Issue Status**: Open
- **Story Points**: 45
- **Required Tasks**: Task 16 - Anomaly Detection Algorithm Enhancement
- **Issue Summary/Description**: Develop a system for reporting detected anomalies.
- **Comments / Notes**: Reporting system requirements gathered.

---

## Task 42: Data Backup Procedures
- **Issue Type**: Task
- **Issue Priority**: Major
- **Assigned Employees**: Timothy Johnson, Rebecca Richardson
- **Advisor Employees**: Robert Stewart
- **Issue Status**: Open
- **Story Points**: 35
- **Required Tasks**: Task 34 - System Recovery Plan
- **Issue Summary/Description**: Establish procedures for regular data backups.
- **Comments / Notes**: Backup schedule proposed.

---

## Task 43: Integration with External APIs
- **Issue Type**: Task
- **Issue Priority**: Major
- **Assigned Employees**: Cynthia Harris, Robert Stewart
- **Advisor Employees**: Susan Moore
- **Issue Status**: Open
- **Story Points**: 50
- **Required Tasks**: Task 13 - API Development for Integration
- **Issue Summary/Description**: Integrate framework with existing external APIs.
- **Comments / Notes**: API documentation reviewed.

---

## Task 44: User Feedback Analysis
- **Issue Type**: Task
- **Issue Priority**: Major
- **Assigned Employees**: Emily Collins, Jennifer Rogers
- **Advisor Employees**: Linda Parker
- **Issue Status**: Open
- **Story Points**: 30
- **Required Tasks**: Task 18 - User Feedback Collection Mechanism
- **Issue Summary/Description**: Analyze user feedback for continuous improvement.
- **Comments / Notes**: Feedback analysis toolkit selected.

---

## Task 45: Continuous Deployment Setup
- **Issue Type**: Task
- **Issue Priority**: Major
- **Assigned Employees**: Timothy Johnson, Rebecca Richardson
- **Advisor Employees**: Justin Lee
- **Issue Status**: Open
- **Story Points**: 40
- **Required Tasks**: Task 39 - Automated Deployment Pipeline
- **Issue Summary/Description**: Establish a continuous deployment process for rapid updates.
- **Comments / Notes**: Deployment scripts drafted.

---

## Task 46: Data Privacy Compliance
- **Issue Type**: Task
- **Issue Priority**: Critical
- **Assigned Employees**: Timothy Johnson, Rebecca Richardson
- **Advisor Employees**: Stephanie Harris
- **Issue Status**: Open
- **Story Points**: 60
- **Required Tasks**: Task 25 - Data Security Audit
- **Issue Summary/Description**: Ensure compliance with data privacy regulations.
- **Comments / Notes**: Compliance checklist completed.

---

## Task 47: Data Integration Workflow Templates
- **Issue Type**: Task
- **Issue Priority**: Major
- **Assigned Employees**: Kathleen Stewart, Sarah Jones
- **Advisor Employees**: Jennifer Rogers
- **Issue Status**: Open
- **Story Points**: 35
- **Required Tasks**: Task 10 - Workflow Automation Scripts
- **Issue Summary/Description**: Develop templates for common data integration workflows.
- **Comments / Notes**: Template design phase ongoing.

---

## Task 48: Real-Time Data Alert System
- **Issue Type**: Task
- **Issue Priority**: Major
- **Assigned Employees**: Emily Collins, Jennifer Rogers
- **Advisor Employees**: Rebecca Richardson
- **Issue Status**: Open
- **Story Points**: 40
- **Required Tasks**: Task 23 - Real-Time Data Monitoring Tools
- **Issue Summary/Description**: Implement an alert system for real-time data issues.
- **Comments / Notes**: Alert criteria being finalized.

---

## Task 49: User Role Management System
- **Issue Type**: Task
- **Issue Priority**: Major
- **Assigned Employees**: Timothy Johnson, Rebecca Richardson
- **Advisor Employees**: Robert Stewart
- **Issue Status**: Open
- **Story Points**: 30
- **Required Tasks**: Task 12 - Security Protocol Implementation
- **Issue Summary/Description**: Develop a system for managing user roles and permissions.
- **Comments / Notes**: Role management requirements identified.

---

## Task 50: Integration with Data Lakes
- **Issue Type**: Task
- **Issue Priority**: Major
- **Assigned Employees**: Cynthia Harris, Robert Stewart
- **Advisor Employees**: Susan Moore
- **Issue Status**: Open
- **Story Points**: 50
- **Required Tasks**: Task 38 - Data Source Connectivity
- **Issue Summary/Description**: Enable integration with popular data lake solutions.
- **Comments / Notes**: Data lake compatibility assessment ongoing.

---

## Task 51: User Interface Accessibility Enhancements
- **Issue Type**: Task
- **Issue Priority**: Major
- **Assigned Employees**: Emily Collins, Jennifer Rogers
- **Advisor Employees**: Sarah Jones
- **Issue Status**: Open
- **Story Points**: 25
- **Required Tasks**: Task 5 - User Interface Design
- **Issue Summary/Description**: Enhance accessibility features of the user interface.
- **Comments / Notes**: Accessibility guidelines reviewed.

---

## Task 52: Automated Data Archiving
- **Issue Type**: Task
- **Issue Priority**: Major
- **Assigned Employees**: Susan Moore, Linda Parker
- **Advisor Employees**: Timothy Johnson
- **Issue Status**: Open
- **Story Points**: 30
- **Required Tasks**: Task 42 - Data Backup Procedures
- **Issue Summary/Description**: Implement automated data archiving for historical records.
- **Comments / Notes**: Archiving strategy in development.

---

## Task 53: Change Management Process
- **Issue Type**: Task
- **Issue Priority**: Major
- **Assigned Employees**: Timothy Johnson, Rebecca Richardson
- **Advisor Employees**: Justin Lee
- **Issue Status**: Open
- **Story Points**: 35
- **Required Tasks**: Task 45 - Continuous Deployment Setup
- **Issue Summary/Description**: Establish a process for managing changes to the framework.
- **Comments / Notes**: Change management policy drafted.

---

## Task 54: Data Quality Dashboard
- **Issue Type**: Task
- **Issue Priority**: Major
- **Assigned Employees**: Emily Collins, Jennifer Rogers
- **Advisor Employees**: Kathleen Stewart
- **Issue Status**: Open
- **Story Points**: 40
- **Required Tasks**: Task 23 - Real-Time Data Monitoring Tools
- **Issue Summary/Description**: Develop a dashboard for monitoring data quality metrics.
- **Comments / Notes**: Dashboard requirements gathered.

---

## Task 55: Data Transformation Pipeline Testing
- **Issue Type**: Task
- **Issue Priority**: Major
- **Assigned Employees**: Kathleen Stewart, Sarah Jones
- **Advisor Employees**: Susan Moore
- **Issue Status**: Open
- **Story Points**: 35
- **Required Tasks**: Task 35 - Data Transformation Rule Optimization
- **Issue Summary/Description**: Test and validate data transformation pipelines.
- **Comments / Notes**: Test cases being defined.

---

## Task 56: Continuous Improvement Plan
- **Issue Type**: Task
- **Issue Priority**: Major
- **Assigned Employees**: Entire Team
- **Advisor Employees**: Stephanie Harris
- **Issue Status**: Open
- **Story Points**: 30
- **Required Tasks**: Task 18 - User Feedback Collection Mechanism
- **Issue Summary/Description**: Develop a plan for continuous improvement of the framework.
- **Comments / Notes**: Improvement opportunities identified.

---

## Task 57: Data Integration API Documentation
- **Issue Type**: Task
- **Issue Priority**: Major
- **Assigned Employees**: Cynthia Harris, Robert Stewart
- **Advisor Employees**: Rebecca Richardson
- **Issue Status**: Open
- **Story Points**: 25
- **Required Tasks**: Task 43 - Integration with External APIs
- **Issue Summary/Description**: Document APIs for data integration processes.
- **Comments / Notes**: API documentation template ready.

---

## Task 58: Incident Response Plan
- **Issue Type**: Task
- **Issue Priority**: Critical
- **Assigned Employees**: Timothy Johnson, Rebecca Richardson
- **Advisor Employees**: Justin Lee
- **Issue Status**: Open
- **Story Points**: 60
- **Required Tasks**: Task 34 - System Recovery Plan
- **Issue Summary/Description**: Develop a plan for responding to data incidents.
- **Comments / Notes**: Incident response scenarios developed.

---

## Task 59: Data Integration Performance Tuning
- **Issue Type**: Task
- **Issue Priority**: Major
- **Assigned Employees**: Kathleen Stewart, Sarah Jones
- **Advisor Employees**: Susan Moore
- **Issue Status**: Open
- **Story Points**: 40
- **Required Tasks**: Task 20 - Data Pipeline Optimization
- **Issue Summary/Description**: Tune data integration processes for optimal performance.
- **Comments / Notes**: Performance tuning techniques under review.

---

## Task 60: Data Flow Monitoring System
- **Issue Type**: Task
- **Issue Priority**: Major
- **Assigned Employees**: Emily Collins, Jennifer Rogers
- **Advisor Employees**: Linda Parker
- **Issue Status**: Open
- **Story Points**: 40
- **Required Tasks**: Task 23 - Real-Time Data Monitoring Tools
- **Issue Summary/Description**: Develop a system for monitoring data flow in real-time.
- **Comments / Notes**: Monitoring system design in progress.

---

## Task 61: User Acceptance Testing
- **Issue Type**: Task
- **Issue Priority**: Major
- **Assigned Employees**: Entire Team
- **Advisor Employees**: Stephanie Harris
- **Issue Status**: Open
- **Story Points**: 50
- **Required Tasks**: Task 44 - User Feedback Analysis
- **Issue Summary/Description**: Conduct user acceptance testing to validate framework.
- **Comments / Notes**: Test plan being finalized.

---

## Task 62: Data Anonymization Tools
- **Issue Type**: Task
- **Issue Priority**: Major
- **Assigned Employees**: Susan Moore, Linda Parker
- **Advisor Employees**: Timothy Johnson
- **Issue Status**: Open
- **Story Points**: 30
- **Required Tasks**: Task 46 - Data Privacy Compliance
- **Issue Summary/Description**: Develop tools for data anonymization to protect privacy.
- **Comments / Notes**: Anonymization techniques being researched.

---

## Task 63: Feature Request Management
- **Issue Type**: Task
- **Issue Priority**: Major
- **Assigned Employees**: Emily Collins, Jennifer Rogers
- **Advisor Employees**: Sarah Jones
- **Issue Status**: Open
- **Story Points**: 25
- **Required Tasks**: Task 44 - User Feedback Analysis
- **Issue Summary/Description**: Manage and prioritize feature requests from users.
- **Comments / Notes**: Feature request process established.

---

## Task 64: Data Migration Support
- **Issue Type**: Task
- **Issue Priority**: Major
- **Assigned Employees**: Timothy Johnson, Rebecca Richardson
- **Advisor Employees**: Robert Stewart
- **Issue Status**: Open
- **Story Points**: 35
- **Required Tasks**: Task 38 - Data Source Connectivity
- **Issue Summary/Description**: Provide support for data migration to the new framework.
- **Comments / Notes**: Migration support process documented.

---

## Task 65: System Performance Review
- **Issue Type**: Task
- **Issue Priority**: Major
- **Assigned Employees**: Entire Team
- **Advisor Employees**: Stephanie Harris
- **Issue Status**: Open
- **Story Points**: 40
- **Required Tasks**: Task 29 - Performance Benchmarking
- **Issue Summary/Description**: Conduct a comprehensive review of system performance.
- **Comments / Notes**: Performance review criteria developed.

---

## Task 66: Data Quality Improvement Initiatives
- **Issue Type**: Task
- **Issue Priority**: Major
- **Assigned Employees**: Susan Moore, Linda Parker
- **Advisor Employees**: Cynthia Harris
- **Issue Status**: Open
- **Story Points**: 30
- **Required Tasks**: Task 54 - Data Quality Dashboard
- **Issue Summary/Description**: Initiate projects to improve data quality across the board.
- **Comments / Notes**: Improvement initiatives identified.

---

## Task 67: User Onboarding Process
- **Issue Type**: Task
- **Issue Priority**: Major
- **Assigned Employees**: Emily Collins, Jennifer Rogers
- **Advisor Employees**: Linda Parker
- **Issue Status**: Open
- **Story Points**: 25
- **Required Tasks**: Task 36 - User Training Workshops
- **Issue Summary/Description**: Develop a process for onboarding new users to the framework.
- **Comments / Notes**: Onboarding process draft prepared.

---

## Task 68: Data Source Authentication
- **Issue Type**: Task
- **Issue Priority**: Major
- **Assigned Employees**: Timothy Johnson, Rebecca Richardson
- **Advisor Employees**: Robert Stewart
- **Issue Status**: Open
- **Story Points**: 35
- **Required Tasks**: Task 49 - User Role Management System
- **Issue Summary/Description**: Implement authentication mechanisms for data sources.
- **Comments / Notes**: Authentication protocols being reviewed.

---

## Task 69: Real-Time Data Processing Enhancements
- **Issue Type**: Task
- **Issue Priority**: Major
- **Assigned Employees**: Kathleen Stewart, Sarah Jones
- **Advisor Employees**: Susan Moore
- **Issue Status**: Open
- **Story Points**: 40
- **Required Tasks**: Task 60 - Data Flow Monitoring System
- **Issue Summary/Description**: Enhance capabilities for real-time data processing.
- **Comments / Notes**: Enhancement opportunities explored.

---

## Task 70: Final Project Review and Handover
- **Issue Type**: Task
- **Issue Priority**: Critical
- **Assigned Employees**: Team Leadership
- **Advisor Employees**: Stephanie Harris, Justin Lee
- **Issue Status**: Open
- **Story Points**: 100
- **Required Tasks**: All Tasks
- **Issue Summary/Description**: Conduct a final review of the project and handover to operations.
- **Comments / Notes**: Handover checklist completed, final review scheduled.

---
"""
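
Because every task block in `jira_tasks` follows the same heading-plus-bullets layout, its fields can also be read deterministically, for example to cross-check whatever the LLM extracts later. The following is a minimal sketch, assuming the layout shown above; `parse_task` and the two regular expressions are hypothetical helpers, not part of the original notebook.

```python
import re

# Hypothetical helpers (not in the original notebook).
# Matches headings such as "## Task 32: Machine Learning Model Integration".
HEADING = re.compile(r"^## Task (\d+): (.+)$", re.MULTILINE)
# Matches bullets such as "- **Story Points**: 55".
FIELD = re.compile(r"^- \*\*(.+?)\*\*: (.*)$", re.MULTILINE)


def parse_task(block: str) -> dict:
    """Return the task number, title, and bullet fields of one task block."""
    heading = HEADING.search(block)
    if heading is None:
        raise ValueError("block has no '## Task N: Title' heading")
    task = {"TaskID": int(heading.group(1)), "Name": heading.group(2).strip()}
    for key, value in FIELD.findall(block):
        task[key] = value.strip()  # values are kept as raw strings
    return task


sample = """
## Task 32: Machine Learning Model Integration
- **Issue Type**: Task
- **Issue Priority**: Major
- **Assigned Employees**: Cynthia Harris, Robert Stewart
- **Advisor Employees**: Rebecca Richardson
- **Issue Status**: Open
- **Story Points**: 55
- **Required Tasks**: Task 21 - Integration with Machine Learning Tools
"""

task = parse_task(sample)
print(task["TaskID"], task["Name"], task["Story Points"])
```

Values stay as strings on purpose; converting `Story Points` to an integer or splitting the employee lists is left to the caller.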

In [6]:
document_list = [proposal, requirements, team_alloc, roadmap, jira_tasks]

print(document_list)

["\n# Project Proposal: Intelligent Data Integration Framework\n\n## Project Name\n**Intelligent Data Integration Framework**\n\n## Project Summary\nThe Intelligent Data Integration Framework aims to revolutionize data integration processes by deploying an AI-driven solution that automates schema matching, data transformation, and data cleansing. This initiative is designed to minimize manual data handling efforts and enhance data quality across disparate data sources.\n\n## Business Case / Problem Statement\nIn today's data-driven environment, organizations struggle with integrating data from various sources, each with unique schemas and formats. This process is typically labor-intensive, prone to errors, and requires significant human resources. The current manual integration methodologies are inefficient and cannot meet the dynamic demands of businesses seeking real-time data insights. Hence, there is a compelling need for an intelligent framework that simplifies and automates these

# Initialize LangChain Transformer

In [7]:
!pip install langchain langchain-core langchain-community -q

# For LLMGraphTransformer
!pip install langchain-openai langchain-experimental -q

# For GlinerGraphTransformer
# !pip install langchain-experimental gliner-spacy gliner glirel==1.0.0 loguru -q

Collecting langchain-community
  Downloading langchain_community-0.3.18-py3-none-any.whl.metadata (2.4 kB)
Collecting pq
  Downloading pq-1.9.1.tar.gz (15 kB)
  Preparing metadata (setup.py) ... done
Collecting dataclasses-json<0.7,>=0.5.7 (from langchain-community)
  Downloading dataclasses_json-0.6.7-py3-none-any.whl.metadata (25 kB)
Collecting pydantic-settings<3.0.0,>=2.4.0 (from langchain-community)
  Downloading pydantic_settings-2.8.1-py3-none-any.whl.metadata (3.5 kB)
Collecting httpx-sse<1.0.0,>=0.4.0 (from langchain-community)
  Downloading httpx_sse-0.4.0-py3-none-any.whl.metadata (9.0 kB)
Collecting marshmallow<4.0.0,>=3.18.0 (from dataclasses-json<0.7,>=0.5.7->langchain-community)
  Downloading marshmallow-3.26.1-py3-none-any.whl.metadata (7.3 kB)
Collecting typing-inspect<1,>=0.4.0 (from dataclasses-json<0.7,>=0.5.7->langchain-community)
  Downloading typing_inspect-0.9.0-py3-none-any.whl.metadata (1.5 kB)
Collecting python-dotenv>=0.21.0 (from pydantic-settin

In [8]:
import getpass
import os

if os.getenv("OPENAI_API_KEY") is None:
    os.environ["OPENAI_API_KEY"] = getpass.getpass("Input your openai API key")

# if os.getenv("HF_TOKEN") is None:
#     os.environ["HF_TOKEN"] = getpass.getpass("Input your huggingface token")

Input your openai API key··········


In [12]:
import os

from langchain_experimental.graph_transformers import LLMGraphTransformer #, GlinerGraphTransformer
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate

llm = ChatOpenAI(temperature=0, model_name="gpt-4o")

template = ChatPromptTemplate([
    ("system", "You are a professional document-to-graph extractor. You will be given a list of project documents for a new project that a data engineering company plans to implement. Identify exactly ONE 'Project' node with its relevant data, and exhaustively extract every 'Employee' and 'Task' node along with all relationships and node properties. If you don't finish, you will lose your house and your wife will leave you, so you had better do it properly."),
])



llm_transformer = LLMGraphTransformer(
    llm=llm,
    allowed_nodes=["Employee", "Task", "Project"],
    allowed_relationships=[
        ("Employee", "DO", "Task"),
        ("Employee", "VERIFIES", "Task"),
        ("Task", "REQUIRES", "Task"),
        ("Project", "TRACKS", "Task"),
        ("Employee", "MANAGES", "Employee"),
    ],
    # prompt=template,
    strict_mode=True,
    relationship_properties=[],
    node_properties=[
        "EmpID",
        "FirstName",
        "LastName",
        "Email",
        "Role",
        "Department",
        "Team",
        "Seniority",
        "HireDate",
        "Salary",
        "ManagerID",
        "Skills",
        "Certifications",
        "ExperienceLevel",
        "CurrentWorkload",
        "Availability",
        "ProjectHistory",

        "TaskID",
        "Description",
        "AssignedEmployees",
        "Advisors",
        "PrecedingTasks",
        "StoryPoints",
        "StartTime",
        "EstimatedFinishTime",
        "Status",
        "ActualFinishTime",
        "RequiredSkills",
        "Priority",
        "TaskType",

        "ProjectID",
        "Name",
        "Summary",
        "BusinessCase",
        "ExpectedOutcomes",
        "MonetaryValue",
        "TimeValue",
        "Budget",
        "Timeline:StartDate",
        "Timeline:EstimatedEndDate",
        "Timeline:Milestones:Name",
        "Timeline:Milestones:Date",
        "Stakeholders",
        # "Priority" is already listed among the task properties above, so it is not repeated here.
    ],
)
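
To make the target schema concrete, the sketch below shows the relationship triples one would expect for a single task under the `allowed_relationships` above. It assumes that "Assigned Employees" corresponds to `DO`, "Advisor Employees" to `VERIFIES`, and "Required Tasks" to `REQUIRES`; that mapping is an interpretation, not something the notebook states, and `task_relationships` is a hypothetical helper.

```python
import re


def _names(csv: str) -> list[str]:
    """Split a comma-separated field into trimmed, non-empty names."""
    return [part.strip() for part in csv.split(",") if part.strip()]


def task_relationships(name: str, assigned: str, advisors: str, required: str) -> list[tuple]:
    """Build (source, type, target) triples for one task.

    Assumed mapping: Assigned Employees -> DO, Advisor Employees -> VERIFIES,
    Required Tasks -> REQUIRES.
    """
    triples = [(person, "DO", name) for person in _names(assigned)]
    triples += [(person, "VERIFIES", name) for person in _names(advisors)]
    # "Task 21 - Integration with Machine Learning Tools" -> keep only the title.
    match = re.match(r"Task \d+ - (.+)", required)
    if match:
        triples.append((name, "REQUIRES", match.group(1).strip()))
    return triples


triples = task_relationships(
    name="Machine Learning Model Integration",
    assigned="Cynthia Harris, Robert Stewart",
    advisors="Rebecca Richardson",
    required="Task 21 - Integration with Machine Learning Tools",
)
for triple in triples:
    print(triple)
```

Placeholder values such as "Entire Team" or "All Tasks" are not expanded by this sketch: the former would be treated as a single name and the latter yields no `REQUIRES` triple.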

In [13]:
document_list = [jira_tasks]

In [14]:
from langchain_core.documents import Document

# Prepare project documents and convert them to graph documents
documents = [Document(page_content=text) for text in document_list]
graph_documents = llm_transformer.convert_to_graph_documents(documents)

print(f"Nodes:{graph_documents[0].nodes}")
print(f"Relationships:{graph_documents[0].relationships}")

LengthFinishReasonError: Could not parse response content as the length limit was reached - CompletionUsage(completion_tokens=16384, prompt_tokens=8934, total_tokens=25318, completion_tokens_details=CompletionTokensDetails(accepted_prediction_tokens=0, audio_tokens=0, reasoning_tokens=0, rejected_prediction_tokens=0), prompt_tokens_details=PromptTokensDetails(audio_tokens=0, cached_tokens=0))
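
The error above reports `completion_tokens=16384`: the model hit its completion limit before it finished emitting the graph for the whole `jira_tasks` document in one response. One possible workaround, sketched here under the assumption that tasks are separated by `---` lines as in the string above, is to split the document into smaller batches and convert each batch separately. `split_tasks` and `batch` are hypothetical helpers, and the batch size of 2 is only illustrative.

```python
import re


def split_tasks(markdown: str) -> list[str]:
    """Split a Jira-style export into one chunk per '## Task N: ...' section."""
    # Split on horizontal rules that occupy a line of their own.
    parts = re.split(r"^[ \t]*---[ \t]*$", markdown, flags=re.MULTILINE)
    return [part.strip() for part in parts if "## Task" in part]


def batch(chunks: list[str], size: int) -> list[str]:
    """Join consecutive chunks into batches of at most `size` tasks each."""
    return [
        "\n\n---\n\n".join(chunks[i:i + size])
        for i in range(0, len(chunks), size)
    ]


sample = """
## Task 1: A
- **Story Points**: 5

---

## Task 2: B
- **Story Points**: 8

---

## Task 3: C
- **Story Points**: 3
"""

chunks = split_tasks(sample)
batches = batch(chunks, size=2)
print(len(chunks), len(batches))

# Each batch could then be wrapped as in the cell above, e.g.
#   documents = [Document(page_content=text) for text in batches]
# before calling llm_transformer.convert_to_graph_documents(documents).
```

With batching, `graph_documents` holds one entry per batch rather than a single entry, so the nodes and relationships have to be collected across all entries instead of reading only `graph_documents[0]`.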

In [11]:
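# Set up GlinerGraphTransformer (needs the optional gliner/glirel packages from the install cell)
from langchain_experimental.graph_transformers import GlinerGraphTransformer

glinel_transformer = GlinerGraphTransformer(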
    allowed_nodes=["Employee", "Task", "Project"],
    allowed_relationships={
        "BELONGS_TO": {
            "from": "Employee",
            "to": "Project",
        },
        "DO_TASK": {
            "from": "Employee",
            "to": "Task",
        },
        "REQUIRES_TASK": {
            "from": "Task",
            "to": "Task",
        },
        "TRACKS_TASK": {
            "from": "Project",
            "to": "Task",
        },
    },
    device="cuda"
)

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.


AttributeError: 'Namespace' object has no attribute 'max_entity_pair_distance'

In [None]:
from langchain_core.documents import Document

# Prepare project documents and convert them to graph documents
documents = [Document(page_content=text) for text in document_list]
graph_documents = glinel_transformer.convert_to_graph_documents(documents)

print(f"Nodes:{graph_documents[0].nodes}")
print(f"Relationships:{graph_documents[0].relationships}")