add a token count library
alnutile committed May 31, 2023
1 parent 0365960 commit ba87f8a
Showing 22 changed files with 106 additions and 392 deletions.
6 changes: 3 additions & 3 deletions .github/ISSUE_TEMPLATE/config.yml
@@ -1,11 +1,11 @@
blank_issues_enabled: false
contact_links:
- name: Ask a question
url: https://github.com/:vendor_name/:package_name/discussions/new?category=q-a
url: https://github.com/sundance-solutions/larachain-token-count/discussions/new?category=q-a
about: Ask the community for help
- name: Request a feature
url: https://github.com/:vendor_name/:package_name/discussions/new?category=ideas
url: https://github.com/sundance-solutions/larachain-token-count/discussions/new?category=ideas
about: Share ideas for new features
- name: Report a security issue
url: https://github.com/:vendor_name/:package_name/security/policy
url: https://github.com/sundance-solutions/larachain-token-count/security/policy
about: Learn how to notify us for sensitive bugs
2 changes: 1 addition & 1 deletion CHANGELOG.md
@@ -1,3 +1,3 @@
# Changelog

All notable changes to `:package_name` will be documented in this file.
All notable changes to `larachain-token-count` will be documented in this file.
2 changes: 1 addition & 1 deletion LICENSE.md
@@ -1,6 +1,6 @@
The MIT License (MIT)

Copyright (c) :vendor_name <author@domain.com>
Copyright (c) sundance-solutions <365385+alnutile@users.noreply.github.com>

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
69 changes: 20 additions & 49 deletions README.md
@@ -1,68 +1,39 @@
# :package_description
# Quick helper to count tokens

[![Latest Version on Packagist](https://img.shields.io/packagist/v/:vendor_slug/:package_slug.svg?style=flat-square)](https://packagist.org/packages/:vendor_slug/:package_slug)
[![GitHub Tests Action Status](https://img.shields.io/github/actions/workflow/status/:vendor_slug/:package_slug/run-tests.yml?branch=main&label=tests&style=flat-square)](https://github.com/:vendor_slug/:package_slug/actions?query=workflow%3Arun-tests+branch%3Amain)
[![GitHub Code Style Action Status](https://img.shields.io/github/actions/workflow/status/:vendor_slug/:package_slug/fix-php-code-style-issues.yml?branch=main&label=code%20style&style=flat-square)](https://github.com/:vendor_slug/:package_slug/actions?query=workflow%3A"Fix+PHP+code+style+issues"+branch%3Amain)
[![Total Downloads](https://img.shields.io/packagist/dt/:vendor_slug/:package_slug.svg?style=flat-square)](https://packagist.org/packages/:vendor_slug/:package_slug)
<!--delete-->
---
This repo can be used to scaffold a Laravel package. Follow these steps to get started:
[![Latest Version on Packagist](https://img.shields.io/packagist/v/sundance-solutions/larachain-token-count.svg?style=flat-square)](https://packagist.org/packages/sundance-solutions/larachain-token-count)
[![GitHub Tests Action Status](https://img.shields.io/github/actions/workflow/status/sundance-solutions/larachain-token-count/run-tests.yml?branch=main&label=tests&style=flat-square)](https://github.com/sundance-solutions/larachain-token-count/actions?query=workflow%3Arun-tests+branch%3Amain)
[![GitHub Code Style Action Status](https://img.shields.io/github/actions/workflow/status/sundance-solutions/larachain-token-count/fix-php-code-style-issues.yml?branch=main&label=code%20style&style=flat-square)](https://github.com/sundance-solutions/larachain-token-count/actions?query=workflow%3A"Fix+PHP+code+style+issues"+branch%3Amain)
[![Total Downloads](https://img.shields.io/packagist/dt/sundance-solutions/larachain-token-count.svg?style=flat-square)](https://packagist.org/packages/sundance-solutions/larachain-token-count)

1. Press the "Use this template" button at the top of this repo to create a new repo with the contents of this skeleton.
2. Run "php ./configure.php" to run a script that will replace all placeholders throughout all the files.
3. Have fun creating your package.
4. If you need help creating a package, consider picking up our <a href="https://laravelpackage.training">Laravel Package Training</a> video course.
---
<!--/delete-->
This is where your description should go. Limit it to a paragraph or two. Consider adding a small example.
GPT-3 Approximate Token Counter in PHP

## Support us
This repository contains a PHP function that approximates the token count of a text string, following the tokenization rules used by OpenAI's GPT-3.

[<img src="https://github-ads.s3.eu-central-1.amazonaws.com/:package_name.jpg?t=1" width="419px" />](https://spatie.be/github-ad-click/:package_name)
GPT-3, an advanced language model developed by OpenAI, reads text in chunks called tokens. A token in GPT-3 can be as short as one character or as long as one word (e.g., 'a', 'apple'). For languages with more complex scripts (like Chinese, Japanese, etc.), one character can be multiple tokens. Spaces and punctuation are also considered separate tokens.

We invest a lot of resources into creating [best in class open source packages](https://spatie.be/open-source). You can support us by [buying one of our paid products](https://spatie.be/open-source/support-us).
The function provided here offers an approximation of how GPT-3 might tokenize a given string, counting words, spaces, and punctuation as separate tokens. This allows you to estimate the number of tokens in a text string without making an API call, which can be useful for monitoring usage or avoiding unnecessary costs.

We highly appreciate you sending us a postcard from your hometown, mentioning which of our package(s) you are using. You'll find our address on [our contact page](https://spatie.be/about-us). We publish all received postcards on [our virtual postcard wall](https://spatie.be/open-source/postcards).
Please note that this is a simplified approximation, and the actual tokenization may vary slightly in GPT-3's actual implementation. In particular, some words might be tokenized into multiple tokens if they contain special characters or are very long. Additionally, this method may not accurately tokenize languages other than English, especially those using non-Latin characters.

As of the last update in September 2021, OpenAI has not provided a public method for accurately counting tokens the way GPT-3 does. Therefore, this function is an estimation, not a guaranteed accurate count.
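
As a rough illustration of the counting rule described above, the following sketch treats each run of letters or digits, each whitespace character, and each punctuation character as one token. The function name `approximateTokenCount` and the regular expression are assumptions made for this example; they are not taken from the package's source, which may count differently.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper (not the package's implementation): estimates a token
 * count by counting words, whitespace characters, and punctuation characters
 * as separate tokens.
 */
function approximateTokenCount(string $text): int
{
    // Alternatives, tried in order at each position:
    //   1. a run of Unicode letters or digits (one "word" token)
    //   2. a single whitespace character
    //   3. any other single character (punctuation, symbols)
    $pattern = '/[\p{L}\p{N}]+|\s|[^\s\p{L}\p{N}]/u';

    // preg_match_all() returns the number of matches, or false on error
    // (for example, when the input is not valid UTF-8).
    $count = preg_match_all($pattern, $text);

    return $count === false ? 0 : $count;
}

echo approximateTokenCount('Hello, world!'), PHP_EOL; // prints 5
```

Under this rule, `"Your document text..."` yields 8 (three words, two spaces, three periods), which agrees with the value expected in the Usage section, although that agreement alone does not confirm the package applies the same rule.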

## Installation

You can install the package via composer:

```bash
composer require :vendor_slug/:package_slug
```

You can publish and run the migrations with:

```bash
php artisan vendor:publish --tag=":package_slug-migrations"
php artisan migrate
```

You can publish the config file with:

```bash
php artisan vendor:publish --tag=":package_slug-config"
```

This is the contents of the published config file:

```php
return [
];
```

Optionally, you can publish the views using

```bash
php artisan vendor:publish --tag=":package_slug-views"
composer require sundance-solutions/larachain-token-count
```

## Usage

```php
$variable = new VendorName\Skeleton();
echo $variable->echoPhrase('Hello, VendorName!');
use SundanceSolutions\LarachainTokenCount\Facades\LarachainTokenCount;

$text = "Your document text...";
$results = LarachainTokenCount::count($text);
expect($results)->toEqual(8);

```

## Testing
@@ -85,7 +56,7 @@ Please review [our security policy](../../security/policy) on how to report secu

## Credits

- [:author_name](https://github.com/:author_username)
- [Alfred Nutile](https://github.com/alnutile)
- [All Contributors](../../contributors)

## License
29 changes: 14 additions & 15 deletions composer.json
@@ -1,17 +1,17 @@
{
"name": ":vendor_slug/:package_slug",
"description": ":package_description",
"name": "sundance-solutions/larachain-token-count",
"description": "Quick helper to count tokens ",
"keywords": [
":vendor_name",
"sundance-solutions",
"laravel",
":package_slug"
"larachain-token-count"
],
"homepage": "https://github.com/:vendor_slug/:package_slug",
"homepage": "https://github.com/sundance-solutions/larachain-token-count",
"license": "MIT",
"authors": [
{
"name": ":author_name",
"email": "author@domain.com",
"name": "Alfred Nutile",
"email": "365385+alnutile@users.noreply.github.com",
"role": "Developer"
}
],
@@ -30,18 +30,17 @@
"pestphp/pest-plugin-laravel": "^2.0",
"phpstan/extension-installer": "^1.1",
"phpstan/phpstan-deprecation-rules": "^1.0",
"phpstan/phpstan-phpunit": "^1.0",
"spatie/laravel-ray": "^1.26"
"phpstan/phpstan-phpunit": "^1.0"
},
"autoload": {
"psr-4": {
"VendorName\\Skeleton\\": "src/",
"VendorName\\Skeleton\\Database\\Factories\\": "database/factories/"
"SundanceSolutions\\LarachainTokenCount\\": "src/",
"SundanceSolutions\\LarachainTokenCount\\Database\\Factories\\": "database/factories/"
}
},
"autoload-dev": {
"psr-4": {
"VendorName\\Skeleton\\Tests\\": "tests/"
"SundanceSolutions\\LarachainTokenCount\\Tests\\": "tests/"
}
},
"scripts": {
@@ -61,13 +60,13 @@
"extra": {
"laravel": {
"providers": [
"VendorName\\Skeleton\\SkeletonServiceProvider"
"SundanceSolutions\\LarachainTokenCount\\LarachainTokenCountServiceProvider"
],
"aliases": {
"Skeleton": "VendorName\\Skeleton\\Facades\\Skeleton"
"LarachainTokenCount": "SundanceSolutions\\LarachainTokenCount\\Facades\\LarachainTokenCount"
}
}
},
"minimum-stability": "dev",
"prefer-stable": true
}
}
6 changes: 6 additions & 0 deletions config/larachain-token-count.php
@@ -0,0 +1,6 @@
<?php

// config for SundanceSolutions/LarachainTokenCount
return [

];
6 changes: 0 additions & 6 deletions config/skeleton.php

This file was deleted.
