GitHub - Deeksha-Gokarn/Descriptive-Statistical-Analysis-using-Python: Constant Learner

This repo works with raw, unstructured big data. The dataset is a webpage produced by automotive testing; it consists of a very large series of HTML and XML contents. The XML and HTML contents are first parsed in the data pre-processing stage, as shown in Testfall.py, Testschritte.py, and Inhalt.py. The parsed data is then stored in a database. The data extracted from the database is in a transformed, structured format, as shown in the Database file. This data is then loaded into a data frame. Together, these steps form an ETL (extract, transform, load) process. The resulting structured data is more efficient to work with and more human-readable. Statistical analysis, specifically descriptive analysis, is then performed on the structured data. Additionally, error patterns, total error counts, and repeated-testing statistics are visualized with matplotlib histograms, as shown in the matplotlib file (matplolib.py).
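The pipeline described above can be illustrated with a minimal sketch. It is not the code from this repo: the XML layout, the `results` table, and the function names `extract`, `load`, and `describe` are all hypothetical, the sample data is invented, and the standard-library `sqlite3` and `statistics` modules stand in for the actual database and data frame so that the example runs without extra dependencies.

```python
import sqlite3
import statistics
import xml.etree.ElementTree as ET
from collections import Counter

# Invented sample report; the real test output is private and its schema is unknown.
SAMPLE_XML = """
<report>
  <testcase name="brake_check" run="1" errors="2" duration="1.5"/>
  <testcase name="brake_check" run="2" errors="0" duration="1.2"/>
  <testcase name="door_lock" run="1" errors="1" duration="0.8"/>
  <testcase name="door_lock" run="2" errors="1" duration="0.9"/>
  <testcase name="door_lock" run="3" errors="0" duration="0.7"/>
</report>
"""


def extract(xml_text):
    """Extract: parse the XML and yield one typed tuple per test run."""
    root = ET.fromstring(xml_text.strip())
    for node in root.iter("testcase"):
        yield (
            node.get("name"),
            int(node.get("run")),
            int(node.get("errors")),
            float(node.get("duration")),
        )


def load(rows):
    """Load: store the parsed rows in an in-memory SQLite table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE results (name TEXT, run INTEGER, errors INTEGER, duration REAL)"
    )
    conn.executemany("INSERT INTO results VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return conn


def describe(conn):
    """Read the structured rows back and compute simple descriptive statistics."""
    rows = conn.execute("SELECT name, errors, duration FROM results").fetchall()
    errors = [row[1] for row in rows]
    durations = [row[2] for row in rows]
    return {
        "total_errors": sum(errors),
        "mean_errors": statistics.mean(errors),
        "median_duration": statistics.median(durations),
        # Number of runs per test case, i.e. how often each test was repeated.
        "runs_per_test": dict(Counter(row[0] for row in rows)),
    }


conn = load(extract(SAMPLE_XML))
summary = describe(conn)
print(summary)
```

With pandas and matplotlib installed, the same table could instead be read with `pandas.read_sql_query`, summarized with `DataFrame.describe()`, and plotted with `matplotlib.pyplot.hist`; whether the repo does exactly this is an assumption.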

THE OUTPUT OF THE FILES IN THIS REPO IS NOT UPLOADED SINCE IT CONTAINS PRIVATE/PERSONAL DATA

Files (7 commits):

Database.py
Inhalt.py
README.md
Testfall.py
Testschritte.py
matplolib.py
