
Commit 5953521

2 parents a31c8e2 + 6938240 commit 5953521

26 files changed: +1755 -112 lines changed


CONTRIBUTING.md

Lines changed: 1 addition & 1 deletion
@@ -47,7 +47,7 @@ $ git add .
 - To commit, give a descriptive message for the convenience of the reviewer by:
 ```
 # This message gets associated with all files you have changed
-$ git commit -m 'message
+$ git commit -m "message"
 ```
 - **NOTE**: A PR should have only one commit. Multiple commits should be squashed.
 ## Step 6 : Work Remotely
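
The hunk above replaces a command with an unclosed single quote by one with balanced double quotes. As a rough illustration of why the old line was broken, the sketch below runs both strings through Python's `shlex.split`, which approximates POSIX shell tokenizing. The helper name `tokenize` is made up for this example, and an interactive shell would usually wait for more input rather than report an error.

```python
import shlex

# The removed line: the single quote is opened but never closed.
broken = "git commit -m 'message"
# The added line: the double quotes are balanced.
fixed = 'git commit -m "message"'


def tokenize(command):
    """Split a command line roughly as a POSIX shell would, or say why it cannot."""
    try:
        return shlex.split(command)
    except ValueError as exc:
        # shlex raises ValueError when a quotation is left open.
        return f"error: {exc}"


print(tokenize(broken))
print(tokenize(fixed))
```

The first call reports an error because the quotation never closes. The second yields four tokens, with `message` as the single argument to `-m`.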

Desktop News Notifier/Readme.md

Lines changed: 1 addition & 1 deletion
@@ -6,7 +6,7 @@

 - On running the script it gives a notification of top 10 news

-## Select Stocks by volume Increase Instructions: 👨🏻‍💻
+## Desktop Notifier Instructions: 👨🏻‍💻

 ### Step 1:

Num-Plate-Detector/car.jpeg

-10 Bytes

PDF2Text/Readme.md

Lines changed: 57 additions & 0 deletions
@@ -0,0 +1,57 @@
+# <b>PDF2Text</b>
+
+[![forthebadge](https://forthebadge.com/images/badges/made-with-python.svg)](https://forthebadge.com)
+
+## PDF2Text Functionalities : 🚀
+
+- Converts a PDF file to a text file
+
+## PDF2Text Instructions: 👨🏻‍💻
+
+### Step 1:
+
+Open Terminal 💻
+
+### Step 2:
+
+Navigate to the directory where the Python file is located 📂
+
+### Step 3:
+
+Run the command: python script.py or python3 script.py 🧐
+
+### Step 4:
+
+Sit back and Relax. Let the Script do the Job. ☕
+
+## Requirements
+
+- PyPDF2
+
+## DEMO
+
+1) Select the PDF File
+
+![Screenshot (127)](https://user-images.githubusercontent.com/60662775/112711916-ff837580-8ef1-11eb-998b-1c96fec1de2f.png)
+
+2) Place the PDF File in the script folder
+
+![Screenshot (128)](https://user-images.githubusercontent.com/60662775/112711924-12964580-8ef2-11eb-8aec-ef33fb3d19e1.png)
+
+3) Now open cmd
+
+![Screenshot (129)](https://user-images.githubusercontent.com/60662775/112711947-41142080-8ef2-11eb-80bb-71539b301b4e.png)
+
+4) Enter the inputs: the PDF File path and the number of pages
+
+![Screenshot (131)](https://user-images.githubusercontent.com/60662775/112711986-846e8f00-8ef2-11eb-9cbd-cc6dc204b6b3.png)
+
+5) The PDF File will be converted to a text file (OUTPUT)
+
+![Screenshot (132)](https://user-images.githubusercontent.com/60662775/112712000-92bcab00-8ef2-11eb-9191-252d6e6c526d.png)
+
+
+## Author
+
+Amit Kumar Mishra
+

PDF2Text/script.py

Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
+import PyPDF2
+
+pdf = input(r"Enter the path of PDF file: ")
+n = int(input("Enter number of pages: "))
+
+page = PyPDF2.PdfFileReader(pdf)
+for i in range(n):
+    st=""
+    st += page.getPage(i).extractText()
+
+    with open(f'./PDF2Text/text{i}.txt','w') as f:
+        f.write(st)
+
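
The committed script uses the `PdfFileReader`, `getPage` and `extractText` names, which PyPDF2 3.0 removed in favour of `PdfReader`, `reader.pages` and `extract_text()`. The block below is a minimal sketch of the same per-page conversion under that newer API, assuming PyPDF2 3.0 or later is installed. The function names `extract_page_texts` and `write_pages`, the `stem` parameter, and the clamping of the page count are illustrative choices, not part of the commit.

```python
from pathlib import Path


def extract_page_texts(pdf_path, n_pages):
    """Return the text of the first n_pages pages of a PDF.

    Assumes PyPDF2 >= 3.0 (third-party) is installed.
    """
    import PyPDF2  # imported lazily so write_pages() works without it

    reader = PyPDF2.PdfReader(pdf_path)
    # Clamp the request so an over-large page count does not raise IndexError.
    count = min(n_pages, len(reader.pages))
    return [reader.pages[i].extract_text() or "" for i in range(count)]


def write_pages(page_texts, out_dir, stem="text"):
    """Write each page's text to <out_dir>/<stem><i>.txt and return the paths."""
    out_dir = Path(out_dir)
    # Create the folder up front instead of relying on the working directory.
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for i, text in enumerate(page_texts):
        path = out_dir / f"{stem}{i}.txt"
        path.write_text(text, encoding="utf-8")
        paths.append(path)
    return paths
```

A call such as `write_pages(extract_page_texts("sample.pdf", 3), "PDF2Text")` (both arguments hypothetical) would produce `text0.txt` to `text2.txt`, matching the file naming in the committed script. Separating extraction from writing also keeps the file-writing half usable without PyPDF2.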

Remove_POS_hindi_text/Input.png

67.2 KB

Remove_POS_hindi_text/Only_Hindi.txt

Lines changed: 1 addition & 0 deletions
Large diffs are not rendered by default.

Remove_POS_hindi_text/Output.png

115 KB

Remove_POS_hindi_text/README.md

Lines changed: 47 additions & 0 deletions
@@ -0,0 +1,47 @@
+# Package/Script Name
+
+Short description of package/script
+
+--> Package installed: NLTK
+- NLTK stands for 'Natural Language Toolkit'. It provides common algorithms for tokenizing, part-of-speech tagging, stemming, sentiment analysis, topic segmentation, and named entity recognition. NLTK helps the computer to analyze, preprocess, and understand written text.
+
+
+## Setup instructions
+
+--> Explanation on how to setup and run your package/script locally
+- Simply import the NLTK package by writing 'import nltk' in the first line of your script.
+- To run the script locally, save the 'Tagged_Hindi_Corpus.txt' file at a location of your choice.
+- In the code, in fp=open(r"..."), give the location of the file you saved in the previous step.
+- In the code, in fd=open(r"..."), give the location where you want the file containing only Hindi text after removal of the POS tags.
+- Note that I have already run the script, so the only_hindi.txt file already exists. Before executing the script, delete 'only_hindi.txt' so that you can see it recreated after the run.
+- Run the script with "python hindi_POS_tag_removal.py" or "python <name of your py file.py>"
+- You will be able to see the file with only Hindi text.
+
+
+## Detailed explanation of script, if needed
+
+Script is written as follows:
+
+- Open the hindi_tagged_corpus file.
+- Tokenize the data.
+- Create 2 empty lists.
+- Get all the POS categories.
+- Get all the Hindi words.
+- Concatenate the words.
+- Write the words to the only_hindi file.
+
+## Input
+
+![Image](C:\Users\ZAVERI SANYA\Desktop\Amazing-Python-Scripts\Remove_POS_hindi_text\Input.png)
+
+## Output
+![Image](C:\Users\ZAVERI SANYA\Desktop\Amazing-Python-Scripts\Remove_POS_hindi_text\Output.png)
+
+
+## Author(s)
+
+- This code is written by Sanya Devansh Zaveri. [https://github.com/zaverisanya]
+
+## Disclaimers, if any
+
+There are no disclaimers for this script.
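
The steps listed under "Detailed explanation of script" in the README above can be sketched roughly as follows. The script itself is not rendered in this commit view, so this is not the author's code. It assumes each token in the corpus looks like `word_TAG`, with the separator `_` being a guess exposed as the `sep` parameter. It also swaps NLTK tokenization for plain whitespace splitting so that the example runs without third-party packages. All function names are hypothetical.

```python
def split_tagged_text(tagged_text, sep="_"):
    """Split 'word<sep>TAG' tokens into a list of words and a list of tags."""
    words, tags = [], []  # the "2 empty lists" from the README steps
    for token in tagged_text.split():  # whitespace tokenization (not NLTK)
        word, found, tag = token.rpartition(sep)
        if found:
            words.append(word)
            tags.append(tag)
        else:
            # Token carries no tag: keep it unchanged as a word.
            words.append(token)
    return words, tags


def strip_pos_tags(tagged_text, sep="_"):
    """Return the text with POS tags removed, words joined by single spaces."""
    words, _ = split_tagged_text(tagged_text, sep)
    return " ".join(words)


def convert_file(src_path, dst_path, sep="_"):
    """Read a tagged corpus file and write the untagged text to dst_path."""
    with open(src_path, encoding="utf-8") as fp:
        tagged = fp.read()
    with open(dst_path, "w", encoding="utf-8") as fd:
        fd.write(strip_pos_tags(tagged, sep))


# Illustrative sample only; it is not taken from Tagged_Hindi_Corpus.txt.
sample = "राम_NNP घर_NN जाता_VM है_VAUX"
print(strip_pos_tags(sample))
```

Opening both files with `encoding="utf-8"` avoids depending on the platform default encoding, which matters for Devanagari text on Windows.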

Remove_POS_hindi_text/Tagged_Hindi_Corpus.txt

Lines changed: 857 additions & 0 deletions
Large diffs are not rendered by default.
