Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
124 changes: 66 additions & 58 deletions README.md
Original file line number Diff line number Diff line change
@@ -1,46 +1,48 @@
<p align="center"><img src="https://github.com/user-attachments/assets/d28934e1-a1ab-4b6e-9d12-d64067a65a60"><br>An unofficial Python SDK for <a href="https://searchcode.com">SearchCode</a>.<br><i>Search 75 billion lines of code from 40 million projects</i></p>
<p align="center"><img src="https://github.com/user-attachments/assets/d28934e1-a1ab-4b6e-9d12-d64067a65a60"><br>Python SDK for <a href="https://searchcode.com">Searchcode</a>.<br><i>Search 75 billion lines of code from 40 million projects</i></p>
<p align="center"></p>
<p align="center">
<a href="https://github.com/knewkarma-io/knewkarma"><img alt="Code Style" src="https://img.shields.io/badge/code%20style-black-000000?logo=github&link=https%3A%2F%2Fgithub.com%2Frly0nheart%2Fknewkarma"></a>
</p>

## Table Of Contents
* [Code_search](#code_search)
* [Example (Without Filters)](#example-without-filters)
* [Example Language Filter (Java and Javascript)](#example-language-filter-java-and-javascript)
* [Example Source Filter (Bitbucket and CodePlex)](#example-source-filter-bitbucket-and-codeplex)
* [Example Lines of Code Filter (Between 500 and 1000)](#example-lines-of-code-filter-between-500-and-1000)
* [Example (JSONP)](#example-jsonp)
* [Response Attribute Definitions](#response-attribute-definitions)
* [code_result](#code_result)
* [Example](#example)
* [related_results](#related_results)
* [Example](#example-1)
* [Response Attribute Definitions](#response-attribute-definitions-1)
* [About Searchcode](#about-searchcode)
* [Credit](#credit)

***

## code_search
## Installation

```bash

pip install searchcode

```

## Documentation

### Code Search

Queries the code index and returns at most 100 results.

> [!TIP]
All filters supported by searchcode are available. These include
`sources`, `languages` and `lines_of_code`. These work in the same way that the main page works; See the examples below for how to use them.
#### Params

> [!TIP]
To fetch all results for a given query, keep incrementing the `page` parameter until you get a page with an empty
results list.
- `query`: Search term (required).
- The following filters are textual and can be added into query directly
- Filter by file extention **ext:EXTENTION** e.g., _"gsub ext:erb"_
- Filter by language **lang:LANGUAGE** e.g., _"import lang:python"_
- Filter by repository **repo:REPONAME** e.g., _"float Q_rsqrt repo:quake"_
- Filter by user/repository **repo:USERNAME/REPONAME** e.g., _"batf repo:boyter/batf"_
- `page`: Result page starting at 0 through to 49
- `per_page`: Number of results wanted per page (max 100).
- `languages`: List of programming languages to filter by.
- `sources`: List of code sources (e.g., GitHub, BitBucket).
- `lines_of_code_gt`: Filter to sources with greater lines of code than supplied int. Valid values 0 to 10000.
- `lines_of_code_lt`: Filter to sources with less lines of code than supplied int. Valid values 0 to 10000.
- `callback`: Callback function (JSONP only)

> [!IMPORTANT]
If the results list is empty, then this indicates that you have reached the end of the available results.
> If the results list is empty, then this indicates that you have reached the end of the available results.

> To fetch all results for a given query, keep incrementing `page` parameter until you get a page with an empty results
> list.

### Example (Without Filters):
#### Code Search Without Filters

```python

import searchcode as sc

search = sc.code_search(query="test")
Expand All @@ -49,9 +51,10 @@ for result in search.results:
print(result)
```

### Example Language Filter (Java and Javascript):
#### Filter by Language (Java and JavaScript)

```python

import searchcode as sc

search = sc.code_search(query="test", languages=["Java", "JavaScript"])
Expand All @@ -60,9 +63,10 @@ for result in search.results:
print(result.language)
```

### Example Source Filter (Bitbucket and CodePlex):
#### Filter by Source (BitBucket and CodePlex)

```python

import searchcode as sc

search = sc.code_search(query="test", sources=["BitBucket", "CodePlex"])
Expand All @@ -71,29 +75,28 @@ for result in search.results:
print(result.filename)
```

### Example Lines of Code Filter (Between 500 and 1000):
#### Filter by Lines of Code (Between 500 and 1000)

```python

import searchcode as sc

search = sc.code_search(query="test", lines_of_code=500, lines_of_code2=1000)
search = sc.code_search(query="test", lines_of_code_gt=500, lines_of_code_lt=1000)

for result in search.results:
print(result.linescount)
print(result)
```

### Example (JSONP):
#### With Callback Function (JSONP only)

```python
import searchcode as sc

search = sc.code_search(query="soup", page=1, callback="myCallback")

for result in search.results:
print(result)
search = sc.code_search(query="test", callback="myCallback")
print(search)
```

### Response Attribute Definitions
#### Response Attribute Definitions

| Attribute | Description |
|----------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
Expand Down Expand Up @@ -124,39 +127,39 @@ for result in search.results:
| **md5hash** | Calculated MD5 hash of the file's contents. |
| **lines** | Contains line numbers and lines which match the `searchterm`. Lines immediately before and after the match are included. If only the filename matches, up to the first 15 lines of the file are returned. |

## code_result
### Code Result

Returns the raw data from a code file given the code id which can be found as the `id` in a code search result.

Returns the raw data from a code file given the code id which can be found as the id in a code search result.
#### Params

### Example:
- `_id`: Unique identifier for the code file (required).

```python

import searchcode as sc

result = sc.code_result(id=4061576)
print(result)
code = sc.code_result(4061576)
print(code)
```

> Returns raw data from the code file.

## related_results
### Related Results

Returns an array of results given a searchcode unique code id which are considered to be duplicates.
Returns an array of results given a searchcode unique code id which are considered to be duplicates.

> [!IMPORTANT]
The matching is
slightly fuzzy allowing so that small differences between files are ignored.
#### Params

### Example:
- `_id`: Unique identifier for the code file (required).

```python

import searchcode as sc

related = sc.related_results(id=4061576)
related = sc.related_results(4061576)
print(related)
```

### Response Attribute Definitions
#### Response Attribute Definitions

| Attribute | Description |
|----------------|------------------------------------------------------------------------------------------|
Expand All @@ -172,8 +175,13 @@ print(related)

## About Searchcode

Read more about searchcode [here](https://searchcode.com/about/)
Searchcode is a simple, comprehensive source code search engine that indexes billions of lines of code from open-source
projects,
helping you find real world examples of functions, API's and libraries in 243 languages across 10+ public code sources.

[Learn more](https://searchcode.com/about)

## Credit
## Acknowledgements

Special thanks to [Ben Boyter](https://boyter.org/about/), developer of [searchcode.com](https://searchcode.com)
This SDK is developed and maintained by [Richard Mwewa](https://gravatar.com/rly0nheart), in collaboration
with [Ben Boyter](https://boyter.org/about/), the creator of [Searchcode.com](https://searchcode.com).
6 changes: 3 additions & 3 deletions pyproject.toml
Original file line number Diff line number Diff line change
@@ -1,9 +1,9 @@
[tool.poetry]
name = "searchcode"
version = "0.1.1"
description = "An unofficial Python SDK for Searchcode."
version = "0.2.0"
description = "Python SDK for Searchcode."
authors = ["Richard Mwewa <rly0nheart@duck.com>"]
license = "GPL-3.0+ License"
license = "GPLv3+"
readme = "README.md"
homepage = "https://searchcode.com"
repository = "https://github.com/rly0nheart/searchcode-sdk"
Expand Down
80 changes: 38 additions & 42 deletions searchcode/_main.py
Original file line number Diff line number Diff line change
Expand Up @@ -17,53 +17,48 @@


def _get_response(
endpoint: str,
headers: Optional[Dict] = None,
params: Optional[List[Tuple[str, str]]] = None,
endpoint: str, params: Optional[List[Tuple[str, str]]] = None, **kwargs
) -> Union[Dict, List, str]:
"""
Sends a GET request to the specified endpoint with the given headers and parameters.

:param endpoint: The API endpoint to send the request to.
:type endpoint: str
:param headers: Optional HTTP headers to include in the request.
:type headers: Optional[Dict]
:param params: Optional list of query parameters as key-value tuples.
:type params: Optional[List[Tuple[str, str]]]
:return: The parsed JSON response, which could be a dictionary, list, or string.
:rtype: Union[Dict, List, str]
:raises Exception: If the request fails or the server returns an error.
"""

try:
response = requests.get(url=endpoint, params=params, headers=headers)
response.raise_for_status()
return response.json()
response = requests.get(url=endpoint, params=params)
response.raise_for_status()
return response.text if kwargs.get("is_callback") else response.json()

except Exception as e:
raise e


def _object_to_namespace(
obj: Union[List[Dict], Dict]
def _response_to_namespace_obj(
response: Union[List[Dict], Dict]
) -> Union[List[SimpleNamespace], SimpleNamespace, List[Dict], Dict]:
"""
Recursively converts dictionaries and lists of dictionaries into SimpleNamespace objects.
Recursively converts the API response into a SimpleNamespace object(s).

:param obj: The object to convert, either a dictionary or a list of dictionaries.
:type obj: Union[List[Dict], Dict]
:param response: The object to convert, either a dictionary or a list of dictionaries.
:type response: Union[List[Dict], Dict]
:return: A SimpleNamespace object or list of SimpleNamespace objects.
:rtype: Union[List[SimpleNamespace], SimpleNamespace, None]
"""

if isinstance(obj, Dict):
if isinstance(response, Dict):
return SimpleNamespace(
**{key: _object_to_namespace(obj=value) for key, value in obj.items()}
**{
key: _response_to_namespace_obj(response=value)
for key, value in response.items()
}
)
elif isinstance(obj, List):
return [_object_to_namespace(obj=item) for item in obj]
elif isinstance(response, List):
return [_response_to_namespace_obj(response=item) for item in response]
else:
return obj
return response


def code_search(
Expand All @@ -72,10 +67,10 @@ def code_search(
per_page: int = 100,
languages: Optional[List[CODE_LANGUAGES]] = None,
sources: Optional[List[CODE_SOURCES]] = None,
lines_of_code: Optional[int] = None,
lines_of_code2: Optional[int] = None,
lines_of_code_gt: Optional[int] = None,
lines_of_code_lt: Optional[int] = None,
callback: Optional[str] = None,
) -> SimpleNamespace:
) -> Union[SimpleNamespace, str]:
"""
Searches and returns code snippets matching the query.

Expand All @@ -98,10 +93,10 @@ def code_search(
:param sources: Allows filtering to sources supplied by return types.
Supply multiple to filter to multiple sources.
:type sources: Optional[List[CODE_SOURCES]]
:param lines_of_code: Filter to sources with greater lines of code then supplied int. Valid values 0 to 10000.
:type lines_of_code: int
:param lines_of_code2: Filter to sources with fewer lines of code then supplied int. Valid values 0 to 10000.
:type lines_of_code2: int
:param lines_of_code_gt: Filter to sources with greater lines of code than supplied int. Valid values 0 to 10000.
:type lines_of_code_gt: int
:param lines_of_code_lt: Filter to sources with fewer lines of code than supplied int. Valid values 0 to 10000.
:type lines_of_code_lt: int
:param callback: Callback function (JSONP only)
:type callback: str
:return: The search results as a SimpleNamespace object.
Expand All @@ -112,47 +107,48 @@ def code_search(
source_ids = [] if not sources else get_source_ids(source_names=sources)

response = _get_response(
endpoint=f"{_BASE_API_ENDPOINT}/codesearch_I/",
endpoint=f"{_BASE_API_ENDPOINT}/{'jsonp_codesearch_I' if callback else 'codesearch_I'}/",
params=[
("q", query),
("p", page),
("per_page", per_page),
("loc", lines_of_code),
("loc2", lines_of_code2),
("loc", lines_of_code_gt),
("loc2", lines_of_code_lt),
("callback", callback),
*[("lan", language_id) for language_id in language_ids],
*[("src", source_id) for source_id in source_ids],
],
is_callback=callback,
)

return _object_to_namespace(obj=response)
return _response_to_namespace_obj(response=response)


def code_result(id: int) -> SimpleNamespace:
def code_result(_id: int) -> SimpleNamespace:
"""
Returns the raw data from a code file given the code ID which can be found as the `id` in a code search result.

:param id: The unique identifier of the code result.
:type id: int
:param _id: The unique identifier of the code result.
:type _id: int
:return: The code result details as a SimpleNamespace object.
:rtype: SimpleNamespace
"""

response = _get_response(endpoint=f"{_BASE_API_ENDPOINT}/result/{id}")
response = _get_response(endpoint=f"{_BASE_API_ENDPOINT}/result/{_id}")
return response.get("code")


def related_results(id: int) -> SimpleNamespace:
def related_results(_id: int) -> SimpleNamespace:
"""
Returns an array of results given a searchcode unique code id which are considered to be duplicates.

The matching is slightly fuzzy allowing so that small differences between files are ignored.

:param id: The unique identifier of the code result.
:type id: int
:param _id: The unique identifier of the code result.
:type _id: int
:return: A list of related results as a SimpleNamespace object.
:rtype: SimpleNamespace
"""

response = _get_response(endpoint=f"{_BASE_API_ENDPOINT}/related_results/{id}")
return _object_to_namespace(obj=response)
response = _get_response(endpoint=f"{_BASE_API_ENDPOINT}/related_results/{_id}")
return _response_to_namespace_obj(response=response)
Loading
Loading