Created by Muhid Qaiser 

Email : muhidqaiser02@gmail.com 

Linkedin : https://www.linkedin.com/in/muhid-qaiser/

Github : https://github.com/Muhid-Qaiser

# Splitting JSON Data into Chunks

In [2]:
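import json
import requests

# CoinGecko markets endpoint: returns a list of coin records priced in USD.
api_url = "https://api.coingecko.com/api/v3/coins/markets?vs_currency=usd"

# Fail fast on network or HTTP errors instead of parsing an error body.
response = requests.get(api_url, timeout=10)
response.raise_for_status()
json_data = response.json()
json_data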

[{'id': 'bitcoin',
  'symbol': 'btc',
  'name': 'Bitcoin',
  'image': 'https://coin-images.coingecko.com/coins/images/1/large/bitcoin.png?1696501400',
  'current_price': 87326,
  'market_cap': 1734666709251,
  'market_cap_rank': 1,
  'fully_diluted_valuation': 1734666709251,
  'total_volume': 30331812936,
  'high_24h': 88430,
  'low_24h': 86358,
  'price_change_24h': -58.09211586951278,
  'price_change_percentage_24h': -0.06648,
  'market_cap_change_24h': -1343300292.4060059,
  'market_cap_change_percentage_24h': -0.07738,
  'circulating_supply': 19841981.0,
  'total_supply': 19841981.0,
  'max_supply': 21000000.0,
  'ath': 108786,
  'ath_change_percentage': -19.60573,
  'ath_date': '2025-01-20T09:11:54.494Z',
  'atl': 67.81,
  'atl_change_percentage': 128876.00034,
  'atl_date': '2013-07-06T00:00:00.000Z',
  'roi': None,
  'last_updated': '2025-03-26T00:13:33.113Z'},
 {'id': 'ethereum',
  'symbol': 'eth',
  'name': 'Ethereum',
  'image': 'https://coin-images.coingecko.com/coins/images

### Split into JSON-Type Chunks

In [6]:
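from langchain_text_splitters import RecursiveJsonSplitter

# max_chunk_size is measured in characters of the serialized JSON.
json_splitter = RecursiveJsonSplitter(max_chunk_size=200)
# Split only the first record (Bitcoin); split_json expects a dictionary.
json_chunks = json_splitter.split_json(json_data[0])
json_chunks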

[{'id': 'bitcoin',
  'symbol': 'btc',
  'name': 'Bitcoin',
  'image': 'https://coin-images.coingecko.com/coins/images/1/large/bitcoin.png?1696501400',
  'current_price': 87326,
  'market_cap': 1734666709251},
 {'market_cap_rank': 1,
  'fully_diluted_valuation': 1734666709251,
  'total_volume': 30331812936,
  'high_24h': 88430,
  'low_24h': 86358,
  'price_change_24h': -58.09211586951278},
 {'price_change_percentage_24h': -0.06648,
  'market_cap_change_24h': -1343300292.4060059,
  'market_cap_change_percentage_24h': -0.07738,
  'circulating_supply': 19841981.0,
  'total_supply': 19841981.0},
 {'max_supply': 21000000.0,
  'ath': 108786,
  'ath_change_percentage': -19.60573,
  'ath_date': '2025-01-20T09:11:54.494Z',
  'atl': 67.81,
  'atl_change_percentage': 128876.00034},
 {'atl_date': '2013-07-06T00:00:00.000Z',
  'roi': None,
  'last_updated': '2025-03-26T00:13:33.113Z'}]

Notice that these chunks are plain Python dictionaries, not LangChain Document objects.
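To build intuition for how `max_chunk_size` shapes the output above, here is a minimal sketch of size-based splitting using only the standard library. It is a simplified illustration, not the actual `RecursiveJsonSplitter` implementation: the helper `greedy_split` is a hypothetical name, it handles flat dictionaries only, and the sample record is a subset of the Bitcoin fields shown earlier.

```python
import json


def greedy_split(record, max_chunk_size=200):
    """Hypothetical helper: pack key/value pairs of a flat dict into chunks
    whose serialized JSON length stays within max_chunk_size."""
    chunks = []
    current = {}
    for key, value in record.items():
        candidate = {**current, key: value}
        # Start a new chunk once adding this pair would exceed the limit.
        if current and len(json.dumps(candidate)) > max_chunk_size:
            chunks.append(current)
            current = {key: value}
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


# A few fields copied from the Bitcoin record shown above.
record = {
    "id": "bitcoin",
    "symbol": "btc",
    "name": "Bitcoin",
    "current_price": 87326,
    "market_cap": 1734666709251,
    "market_cap_rank": 1,
    "total_volume": 30331812936,
    "high_24h": 88430,
    "low_24h": 86358,
    "ath_date": "2025-01-20T09:11:54.494Z",
    "atl_date": "2013-07-06T00:00:00.000Z",
    "roi": None,
}

chunks = greedy_split(record, max_chunk_size=100)
for chunk in chunks:
    # Print each chunk's serialized size next to its contents.
    print(len(json.dumps(chunk)), chunk)
```

Each printed size should stay at or below the limit, and no key is dropped or duplicated across chunks. The real splitter also recurses into nested dictionaries, which this sketch leaves out.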

### The Splitter Can Also Produce Document Objects

In [8]:

docs = json_splitter.create_documents(texts=[json_data[0]])
docs


[Document(metadata={}, page_content='{"id": "bitcoin", "symbol": "btc", "name": "Bitcoin", "image": "https://coin-images.coingecko.com/coins/images/1/large/bitcoin.png?1696501400", "current_price": 87326, "market_cap": 1734666709251}'),
 Document(metadata={}, page_content='{"market_cap_rank": 1, "fully_diluted_valuation": 1734666709251, "total_volume": 30331812936, "high_24h": 88430, "low_24h": 86358, "price_change_24h": -58.09211586951278}'),
 Document(metadata={}, page_content='{"price_change_percentage_24h": -0.06648, "market_cap_change_24h": -1343300292.4060059, "market_cap_change_percentage_24h": -0.07738, "circulating_supply": 19841981.0, "total_supply": 19841981.0}'),
 Document(metadata={}, page_content='{"max_supply": 21000000.0, "ath": 108786, "ath_change_percentage": -19.60573, "ath_date": "2025-01-20T09:11:54.494Z", "atl": 67.81, "atl_change_percentage": 128876.00034}'),
 Document(metadata={}, page_content='{"atl_date": "2013-07-06T00:00:00.000Z", "roi": null, "last_updated"

The JSON is now split into Document chunks, each holding one serialized piece in `page_content`.

### Split into String Chunks

In [10]:
text = json_splitter.split_text(json_data[0])
text

['{"id": "bitcoin", "symbol": "btc", "name": "Bitcoin", "image": "https://coin-images.coingecko.com/coins/images/1/large/bitcoin.png?1696501400", "current_price": 87326, "market_cap": 1734666709251}',
 '{"market_cap_rank": 1, "fully_diluted_valuation": 1734666709251, "total_volume": 30331812936, "high_24h": 88430, "low_24h": 86358, "price_change_24h": -58.09211586951278}',
 '{"price_change_percentage_24h": -0.06648, "market_cap_change_24h": -1343300292.4060059, "market_cap_change_percentage_24h": -0.07738, "circulating_supply": 19841981.0, "total_supply": 19841981.0}',
 '{"max_supply": 21000000.0, "ath": 108786, "ath_change_percentage": -19.60573, "ath_date": "2025-01-20T09:11:54.494Z", "atl": 67.81, "atl_change_percentage": 128876.00034}',
 '{"atl_date": "2013-07-06T00:00:00.000Z", "roi": null, "last_updated": "2025-03-26T00:13:33.113Z"}']
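Because each string chunk is itself valid JSON, the chunks can be parsed back and merged into a single dictionary. The sketch below shows this with the standard library only; `merge_chunks` is a hypothetical helper name, and the two input strings are copied from the last two chunks of the output above.

```python
import json

# Two string chunks copied from the split_text output above.
text_chunks = [
    '{"max_supply": 21000000.0, "ath": 108786, "ath_change_percentage": -19.60573, "ath_date": "2025-01-20T09:11:54.494Z", "atl": 67.81, "atl_change_percentage": 128876.00034}',
    '{"atl_date": "2013-07-06T00:00:00.000Z", "roi": null, "last_updated": "2025-03-26T00:13:33.113Z"}',
]


def merge_chunks(chunks):
    """Hypothetical helper: parse each JSON string and merge the results."""
    merged = {}
    for chunk in chunks:
        # json.loads turns JSON null back into Python None.
        merged.update(json.loads(chunk))
    return merged


restored = merge_chunks(text_chunks)
print(restored["ath"], restored["roi"])
```

A shallow `update` is enough here because the record is flat. Chunks produced from nested JSON would need a recursive merge instead.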