Fix GlassDoor Country Vietnam(#122)
giga-sec committed Mar 4, 2024
1 parent db01bc6 commit a4f6851
Showing 5 changed files with 28 additions and 48 deletions.
11 changes: 1 addition & 10 deletions README.md
@@ -104,15 +104,6 @@ JobPost
└── is_remote (bool)
```

-### Exceptions
-
-The following exceptions may be raised when using JobSpy:
-
-* `LinkedInException`
-* `IndeedException`
-* `ZipRecruiterException`
-* `GlassdoorException`
-
## Supported Countries for Job Searching

### **LinkedIn**
@@ -147,7 +138,7 @@ You can specify the following countries when searching on Indeed (use the exact
| South Korea | Spain* | Sweden | Switzerland* |
| Taiwan | Thailand | Turkey | Ukraine |
| United Arab Emirates | UK* | USA* | Uruguay |
-| Venezuela | Vietnam | | |
+| Venezuela | Vietnam* | | |


Glassdoor can only fetch 900 jobs from the endpoint we're using on a given search.
40 changes: 17 additions & 23 deletions poetry.lock

Some generated files are not rendered by default.

4 changes: 2 additions & 2 deletions pyproject.toml
@@ -1,6 +1,6 @@
[tool.poetry]
name = "python-jobspy"
-version = "1.1.46"
+version = "1.1.47"
description = "Job scraper for LinkedIn, Indeed, Glassdoor & ZipRecruiter"
authors = ["Zachary Hampton <zachary@bunsly.com>", "Cullen Watson <cullen@bunsly.com>"]
homepage = "https://github.com/Bunsly/JobSpy"
@@ -17,8 +17,8 @@ beautifulsoup4 = "^4.12.2"
pandas = "^2.1.0"
NUMPY = "1.24.2"
pydantic = "^2.3.0"
-html2text = "^2020.1.16"
tls-client = "^1.0.1"
+markdownify = "^0.11.6"


[tool.poetry.group.dev.dependencies]
2 changes: 1 addition & 1 deletion src/jobspy/jobs/__init__.py
@@ -122,7 +122,7 @@ class Country(Enum):
USA = ("usa,us,united states", "www", "com")
URUGUAY = ("uruguay", "uy")
VENEZUELA = ("venezuela", "ve")
-VIETNAM = ("vietnam", "vn")
+VIETNAM = ("vietnam", "vn", "com")

# internal for ziprecruiter
US_CANADA = ("usa/ca", "www")
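Note on the hunk above: the diff does not say what each tuple position means. The sketch below assumes the first element is a comma-separated list of accepted names, the second a site subdomain, and the optional third a Glassdoor domain suffix, so that adding `"com"` is what makes Vietnam usable with Glassdoor. The `names` and `glassdoor_tld` properties and the `find_country` function are hypothetical helpers for illustration, not the project's API.

```python
from enum import Enum
from typing import Optional


class Country(Enum):
    # Values copied from the hunk; the meaning of each position is an assumption.
    USA = ("usa,us,united states", "www", "com")
    URUGUAY = ("uruguay", "uy")
    VIETNAM = ("vietnam", "vn", "com")

    @property
    def names(self) -> list:
        # Hypothetical helper: split the comma-separated aliases.
        return [name.strip() for name in self.value[0].split(",")]

    @property
    def glassdoor_tld(self) -> Optional[str]:
        # Hypothetical helper: a two-element tuple has no Glassdoor entry.
        return self.value[2] if len(self.value) > 2 else None


def find_country(name: str) -> Country:
    """Look up a country by any of its aliases, case-insensitively."""
    wanted = name.strip().lower()
    for country in Country:
        if wanted in country.names:
            return country
    raise ValueError(f"Unsupported country: {name!r}")
```

Under this reading, `find_country("vietnam").glassdoor_tld` was `None` before the commit and is `"com"` after it, while Uruguay still has no third element.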
19 changes: 7 additions & 12 deletions src/jobspy/scrapers/utils.py
@@ -1,15 +1,14 @@
-import re
import logging
-import numpy as np
+import re

-import html2text
-import tls_client
+import numpy as np
import requests
+import tls_client
+from markdownify import markdownify as md
from requests.adapters import HTTPAdapter, Retry

from ..jobs import JobType

-text_maker = html2text.HTML2Text()
logger = logging.getLogger("JobSpy")
logger.propagate = False
if not logger.handlers:
@@ -36,13 +35,9 @@ def count_urgent_words(description: str) -> int:

 def markdown_converter(description_html: str):
     if description_html is None:
-        return ""
-    text_maker.ignore_links = False
-    try:
-        markdown = text_maker.handle(description_html)
-        return markdown.strip()
-    except AssertionError as e:
-        return ""
+        return None
+    markdown = md(description_html)
+    return markdown.strip()


def extract_emails_from_text(text: str) -> list[str] | None:
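Note on the `markdown_converter` hunk above: besides swapping `html2text` for `markdownify`, the change alters the return value for a missing description from `""` to `None`, so callers that concatenate or measure the result may need a guard. The sketch below mirrors the new function and adds a hypothetical caller-side helper, `description_or_empty`, which is not part of the commit. The `ImportError` fallback is only a stand-in so the example runs without the dependency; it strips tags and is not equivalent to `markdownify`.

```python
import re
from typing import Optional

try:
    from markdownify import markdownify as md
except ImportError:
    # Stand-in for illustration only: removes tags, produces no Markdown.
    def md(html: str) -> str:
        return re.sub(r"<[^>]+>", "", html)


def markdown_converter(description_html: Optional[str]) -> Optional[str]:
    # Mirrors the post-commit behaviour: None in, None out.
    if description_html is None:
        return None
    return md(description_html).strip()


def description_or_empty(description_html: Optional[str]) -> str:
    # Hypothetical helper for callers that relied on the old "" result.
    return markdown_converter(description_html) or ""
```

Returning `None` lets downstream code tell "no description" apart from "empty description", at the cost of one extra check wherever a string is required.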