
In [3]:
import sentencepiece as spm

In [6]:
inp = '''
The future of automation testing is inextricably linked with the advancements in Artificial Intelligence (AI). AI is not merely an enhancement; it's a transformative force that is fundamentally reshaping how software quality assurance is approached, making it faster, more intelligent, and more proactive.

Here's a breakdown of the key impacts and a description of the future:

1. Intelligent Test Case Generation and Optimization:

Beyond Scripting: Traditional automation relies on manually written scripts. AI, particularly generative AI, can analyze requirements (user stories, epics), code changes, and historical data to automatically generate comprehensive and diverse test cases. This moves beyond static scripts to dynamically adapting scenarios.
Optimal Test Coverage: AI algorithms can identify high-risk areas in the code and prioritize test execution, ensuring that critical functionalities are thoroughly covered while reducing redundant tests. This leads to more efficient resource allocation and faster feedback cycles.
Data-Driven Testing: AI can generate realistic and varied test data sets, including synthetic data, addressing data privacy concerns while ensuring comprehensive testing under various conditions.
2. Self-Healing and Adaptive Tests:

Reduced Maintenance: One of the biggest pain points in traditional automation testing is test script maintenance due to frequent UI or functionality changes. AI-powered "self-healing" tests can detect these changes and automatically update test scripts, significantly reducing manual effort and preventing unnecessary test failures.
Adaptability to Dynamic Applications: AI systems can learn from real user sessions and application behavior, allowing test scenarios to adapt dynamically to changes in the application under test, which is crucial for agile and rapidly iterating development environments.
3. Enhanced Defect Detection and Prediction:

Smarter Anomaly Detection: AI can analyze vast amounts of test results and identify subtle anomalies or unexpected behaviors that human testers might miss. This leads to more accurate defect detection.
Predictive Analytics: By analyzing historical bug data, code complexity, and usage patterns, AI can predict potential defect-prone areas in the codebase even before they occur. This allows QA teams to shift left and address issues proactively, leading to higher software quality and reduced rework.
Root Cause Analysis: AI can assist in identifying the root causes of failures, categorizing them into product defects, automation defects, or flakiness, thereby accelerating the debugging process.
4. Changing Role of Human Testers (Augmented Intelligence):

Focus on Higher-Value Tasks: AI will automate repetitive and mundane testing tasks, freeing up human testers to focus on more complex, exploratory, and user-centric testing.
Strategic Oversight: Human testers will play a crucial role in validating and refining AI-generated tests, ensuring their quality and alignment with business objectives. They will be responsible for interpreting AI insights and making informed decisions.
New Skillsets: The future will demand testers with a blend of traditional QA skills and expertise in AI, machine learning, and data analysis. Understanding how AI operates and leveraging its capabilities will be paramount.
User-Centric Testing: While AI can simulate user interactions, human testers remain indispensable for evaluating user experience (UX) and usability, especially for subjective aspects.
5. Integration into the CI/CD Pipeline:

Continuous Testing: AI facilitates seamless integration of testing into Continuous Integration/Continuous Delivery (CI/CD) pipelines, enabling automated and real-time execution of tests with every code change. This ensures immediate feedback and faster delivery cycles.
Shift-Right Testing: AI can extend testing into production environments, using post-deployment monitoring and analysis of real user behavior to identify issues that might not surface in pre-production testing.
6. Broader Applications:

Performance Testing Optimization: AI can simulate thousands of virtual users to conduct performance testing under various scenarios, identifying bottlenecks and optimizing system scalability.
Visual Testing: AI-powered visual testing solutions can detect even minor visual inconsistencies, ensuring pixel-perfect UIs across different devices and resolutions.
API Testing: AI can monitor API testing and help in identifying regressions by analyzing logs and test histories.
No-Code/Low-Code Automation: AI will make test automation more accessible to a wider range of users, including business analysts and manual testers, through intuitive no-code/low-code platforms.
Challenges to Consider:

Initial Investment and Setup: Implementing AI-driven testing solutions can require significant initial investment and setup.
Data Quality: AI algorithms depend on high-quality and sufficient training data.
False Positives/Negatives: While improving, AI models can still produce false positives or negatives, requiring human oversight.
Ethical AI Testing: Ensuring AI models are fair, unbiased, and compliant with regulations is crucial to prevent discriminatory outcomes.
In conclusion, the future of automation testing with AI impact is one of augmented intelligence, where AI acts as a powerful co-pilot, revolutionizing the speed, accuracy, and depth of testing. It's a shift from reactive bug detection to proactive quality assurance, ultimately leading to faster, more reliable, and higher-quality software releases. The role of the human tester will evolve from primarily executing repetitive tests to becoming a strategic quality engineer who leverages AI to achieve unprecedented levels of software quality.
'''
# Write the training corpus to disk; SentencePiece trains from a text file.
with open("sample.txt", "w", encoding="utf-8") as f:
    f.write(inp)

In [7]:
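# Train a small BPE model; this writes tockenizer.model and tockenizer.vocab.
spm.SentencePieceTrainer.train(
    input="sample.txt", model_prefix='tockenizer',
    vocab_size=281, model_type='bpe',
    # Explicit ids and surface forms for the special tokens.
    pad_id=0, unk_id=100, eos_id=1, bos_id=2,
    pad_piece='<pad>', unk_piece='<unk>', bos_piece='<s>', eos_piece='</s>',
)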
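
For intuition about what `model_type='bpe'` asks the trainer to learn, here is a minimal pure-Python sketch of byte-pair-encoding merges on a toy word list. It is not SentencePiece's implementation (which works on the raw sentence stream and is far more optimized); the function name `learn_bpe_merges` and the toy corpus are hypothetical, chosen only for illustration.

```python
from collections import Counter


def learn_bpe_merges(words, num_merges):
    """Toy BPE: repeatedly merge the most frequent adjacent symbol pair.

    Hypothetical helper for illustration; not part of SentencePiece.
    """
    # Each word starts as a tuple of characters, prefixed with the
    # word-boundary marker that SentencePiece also uses.
    corpus = Counter(tuple("▁" + w) for w in words)
    merges = []
    for _ in range(num_merges):
        # Count every adjacent pair, weighted by word frequency.
        pair_counts = Counter()
        for symbols, freq in corpus.items():
            for pair in zip(symbols, symbols[1:]):
                pair_counts[pair] += freq
        if not pair_counts:
            break  # every word is already a single symbol
        best = max(pair_counts, key=pair_counts.get)
        merges.append(best)
        # Rewrite the corpus with the chosen pair fused into one symbol.
        new_corpus = Counter()
        for symbols, freq in corpus.items():
            merged, i = [], 0
            while i < len(symbols):
                if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == best:
                    merged.append(symbols[i] + symbols[i + 1])
                    i += 2
                else:
                    merged.append(symbols[i])
                    i += 1
            new_corpus[tuple(merged)] += freq
        corpus = new_corpus
    return merges, corpus


merges, corpus = learn_bpe_merges(["test", "test", "test", "tester"], 1)
print(merges)
```

On this toy input the first merge is `('t', 'e')`, because that pair occurs five times across the four words. A real vocabulary such as the 281-piece one trained above is the result of many such merges plus the single characters and special tokens.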

In [8]:
sp = spm.SentencePieceProcessor()
sp.load('tockenizer.model')

True

In [9]:
text = 'Selenium with java is widely used.'
# out_type=str returns the subword pieces rather than their integer ids.
tokens = sp.encode(text, out_type=str)
print(tokens)

['▁S', 'el', 'en', 'i', 'u', 'm', '▁with', '▁', 'j', 'a', 'v', 'a', '▁is', '▁w', 'i', 'de', 'ly', '▁us', 'ed', '.']
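
The `▁` character in the pieces above marks the start of a word, and words absent from the training text (such as "java") fall back to single characters. As a rough sketch of how pieces map back to text, assuming default normalization, the pieces can be joined and the marker turned back into spaces. The helper name `pieces_to_text` is hypothetical; in practice `sp.decode` does this for you.

```python
def pieces_to_text(pieces):
    """Join subword pieces and turn the '▁' word-boundary marker into spaces."""
    return "".join(pieces).replace("▁", " ").lstrip(" ")


# Pieces copied from the recorded output above.
pieces = ['▁S', 'el', 'en', 'i', 'u', 'm', '▁with', '▁', 'j', 'a', 'v', 'a',
          '▁is', '▁w', 'i', 'de', 'ly', '▁us', 'ed', '.']
print(pieces_to_text(pieces))  # Selenium with java is widely used.
```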


In [10]:
ids = sp.encode(text, out_type=int)
print(ids)

[126, 121, 8, 221, 230, 232, 160, 218, 271, 223, 237, 223, 87, 46, 221, 74, 56, 89, 72, 241]
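
The two recorded outputs line up position by position, so pairing them shows which id belongs to which piece. The sketch below uses only the printed lists, not the model itself; with a loaded model, `sp.id_to_piece` and `sp.piece_to_id` give the same lookup directly.

```python
# Lists copied from the recorded outputs of the two encode calls.
pieces = ['▁S', 'el', 'en', 'i', 'u', 'm', '▁with', '▁', 'j', 'a', 'v', 'a',
          '▁is', '▁w', 'i', 'de', 'ly', '▁us', 'ed', '.']
ids = [126, 121, 8, 221, 230, 232, 160, 218, 271, 223, 237, 223,
       87, 46, 221, 74, 56, 89, 72, 241]

# Build a piece -> id table; repeated pieces ('i', 'a') map to the same id.
piece_to_id = dict(zip(pieces, ids))
for piece, idx in piece_to_id.items():
    print(f"{idx:>3}  {piece}")
```

In this table `▁with` maps to id 160, and the repeated pieces `i` and `a` each keep a single id (221 and 223).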


In [12]:
# Decode a single id; 160 is the id of the piece '▁with' in the output above.
decoded = sp.decode([160])
print(decoded)

with
