fix some type definition in the Reader and add more support to create Reader (microsoft#93)

* remove dup code

* Update Readme.md

* Fix DataSet GNU compile fail bug

* fix GNU Windows align alloc bugs

* add copyright in each file

* fix copyright in dataset

* change kdt distance judgement

* change code structure and add more wrappers

* Update docs

* fix search result

* change IndexBuilder to support binary input data

* temp remove java related projects

* remove javaclient and javacore from the windows build

* Fix SetData issue

* Add vector record count and dimension for reuse and debug

* change default parameter definition

* add uint8 support

* small fix for cosine distance of uint8

* fix AVX distance calculation epu8

* update readme

* Update DistanceUtils.h

* fix python wrapper cannot load larger than 4G memory error

* try to add C# wrapper

* fix owner of C# wrapper

* add C# cmake support

* fix byte array copy

* fix tab to space

* Try to make shared_ptr<T> as Array template

* fix copy

* add Parameters documents

* remove tbb dependency

* fix concurrent_set

* fix gcc 5.x cannot support shared_mutex

* move concurrentset to Helper folder and change find to contains

* Update README.md

* try to use shared_lock to replace lock and unlock, try to use block to manage the increased memory

* fix filling -1

* fix initialization

* change to memset

* add CLR CoreInterface for managed dll

* try to reserve incBlocks capacity

* fix return ErrorCode for AddBatch in Dataset.h

* change return type to ErrorCode for AddBatch

* remove the tbb dependency (microsoft#71) (microsoft#10)

* fix type definition

* change incremental update design

* fix all type

* fix debug mode memory delete assert

* add deletePercentageForRefine judgement

* add dump and load from byte array

* add dump and load from byte array

* fix getNumThreads

* fix loadindex and add index bugs

* Update AlgoTest to add metamapping test

* fix compiling error in g++7

* fix largest cluster cannot be split during clustering

* update fresh ANN implementation (microsoft#85) (microsoft#12)

* fix maxcluster is -1 bug

* fix Reader type definition and add more support

* fix maxcluster is -1 bug (microsoft#91) (microsoft#14)

* move threadPool init into DefaultReader

* try to move VectorSetReader into CoreLibrary

* fix bktree cluster split issue
MaggieQi committed Aug 20, 2019
1 parent d39081b commit e574ee0
Showing 16 changed files with 179 additions and 145 deletions.
4 changes: 4 additions & 0 deletions AnnService/CoreLibrary.vcxproj
@@ -164,6 +164,8 @@
 <ClInclude Include="inc\Core\Common\RelativeNeighborhoodGraph.h" />
 <ClInclude Include="inc\Core\Common\BKTree.h" />
 <ClInclude Include="inc\Core\Common\KDTree.h" />
+<ClInclude Include="inc\Helper\VectorSetReader.h" />
+<ClInclude Include="inc\Helper\VectorSetReaders\DefaultReader.h" />
 </ItemGroup>
 <ItemGroup>
 <ClCompile Include="src\Core\BKT\BKTIndex.cpp" />
@@ -178,6 +180,8 @@
 <ClCompile Include="src\Helper\CommonHelper.cpp" />
 <ClCompile Include="src\Helper\Concurrent.cpp" />
 <ClCompile Include="src\Helper\SimpleIniReader.cpp" />
+<ClCompile Include="src\Helper\VectorSetReader.cpp" />
+<ClCompile Include="src\Helper\VectorSetReaders\DefaultReader.cpp" />
 </ItemGroup>
 <ItemGroup>
 <None Include="packages.config" />
18 changes: 18 additions & 0 deletions AnnService/CoreLibrary.vcxproj.filters
@@ -38,6 +38,12 @@
 <Filter Include="Source Files\Core\KDT">
 <UniqueIdentifier>{8fb36afb-73ed-4c3d-8c9b-c3581d80c5d1}</UniqueIdentifier>
 </Filter>
+<Filter Include="Header Files\Helper\VectorSetReaders">
+<UniqueIdentifier>{f7bc0bc7-1af5-4870-b8ee-fabdbabdb4c4}</UniqueIdentifier>
+</Filter>
+<Filter Include="Source Files\Helper\VectorSetReaders">
+<UniqueIdentifier>{5c1449e0-38b7-4c82-976e-cbdc488d3fb5}</UniqueIdentifier>
+</Filter>
 </ItemGroup>
 <ItemGroup>
 <ClInclude Include="inc\Core\Common.h">
@@ -139,6 +145,12 @@
 <ClInclude Include="inc\Helper\BufferStream.h">
 <Filter>Header Files\Helper</Filter>
 </ClInclude>
+<ClInclude Include="inc\Helper\VectorSetReaders\DefaultReader.h">
+<Filter>Header Files\Helper\VectorSetReaders</Filter>
+</ClInclude>
+<ClInclude Include="inc\Helper\VectorSetReader.h">
+<Filter>Header Files\Helper</Filter>
+</ClInclude>
 </ItemGroup>
 <ItemGroup>
 <ClCompile Include="src\Core\VectorIndex.cpp">
@@ -177,6 +189,12 @@
 <ClCompile Include="src\Core\Common\NeighborhoodGraph.cpp">
 <Filter>Source Files\Core\Common</Filter>
 </ClCompile>
+<ClCompile Include="src\Helper\VectorSetReaders\DefaultReader.cpp">
+<Filter>Source Files\Helper\VectorSetReaders</Filter>
+</ClCompile>
+<ClCompile Include="src\Helper\VectorSetReader.cpp">
+<Filter>Source Files\Helper</Filter>
+</ClCompile>
 </ItemGroup>
 <ItemGroup>
 <None Include="packages.config" />
4 changes: 0 additions & 4 deletions AnnService/IndexBuilder.vcxproj
@@ -139,15 +139,11 @@
 <ItemGroup>
 <ClInclude Include="inc\IndexBuilder\Options.h" />
 <ClInclude Include="inc\IndexBuilder\ThreadPool.h" />
-<ClInclude Include="inc\IndexBuilder\VectorSetReader.h" />
-<ClInclude Include="inc\IndexBuilder\VectorSetReaders\DefaultReader.h" />
 </ItemGroup>
 <ItemGroup>
 <ClCompile Include="src\IndexBuilder\main.cpp" />
 <ClCompile Include="src\IndexBuilder\Options.cpp" />
 <ClCompile Include="src\IndexBuilder\ThreadPool.cpp" />
-<ClCompile Include="src\IndexBuilder\VectorSetReader.cpp" />
-<ClCompile Include="src\IndexBuilder\VectorSetReaders\DefaultReader.cpp" />
 </ItemGroup>
 <ItemGroup>
 <None Include="packages.config" />
24 changes: 3 additions & 21 deletions AnnService/IndexBuilder.vcxproj.filters
@@ -1,4 +1,4 @@
-<?xml version="1.0" encoding="utf-8"?>
+<?xml version="1.0" encoding="utf-8"?>
 <Project ToolsVersion="4.0" xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
 <ItemGroup>
 <Filter Include="Source Files">
@@ -9,12 +9,6 @@
 <UniqueIdentifier>{93995380-89BD-4b04-88EB-625FBE52EBFB}</UniqueIdentifier>
 <Extensions>h;hh;hpp;hxx;hm;inl;inc;xsd</Extensions>
 </Filter>
-<Filter Include="Header Files\VectorSetReaders">
-<UniqueIdentifier>{cf68b421-6a65-44f2-bf43-438b13940d7d}</UniqueIdentifier>
-</Filter>
-<Filter Include="Source Files\VectorSetReaders">
-<UniqueIdentifier>{41ac91f9-6b6d-4341-8791-12f672d6ad5c}</UniqueIdentifier>
-</Filter>
 </ItemGroup>
 <ItemGroup>
 <ClInclude Include="inc\IndexBuilder\Options.h">
@@ -23,27 +17,15 @@
 <ClInclude Include="inc\IndexBuilder\ThreadPool.h">
 <Filter>Header Files</Filter>
 </ClInclude>
-<ClInclude Include="inc\IndexBuilder\VectorSetReader.h">
-<Filter>Header Files</Filter>
-</ClInclude>
-<ClInclude Include="inc\IndexBuilder\VectorSetReaders\DefaultReader.h">
-<Filter>Header Files\VectorSetReaders</Filter>
-</ClInclude>
 </ItemGroup>
 <ItemGroup>
 <ClCompile Include="src\IndexBuilder\Options.cpp">
 <Filter>Source Files</Filter>
 </ClCompile>
-<ClCompile Include="src\IndexBuilder\ThreadPool.cpp">
-<Filter>Source Files</Filter>
-</ClCompile>
-<ClCompile Include="src\IndexBuilder\VectorSetReader.cpp">
+<ClCompile Include="src\IndexBuilder\main.cpp">
 <Filter>Source Files</Filter>
 </ClCompile>
-<ClCompile Include="src\IndexBuilder\VectorSetReaders\DefaultReader.cpp">
-<Filter>Source Files\VectorSetReaders</Filter>
-</ClCompile>
-<ClCompile Include="src\IndexBuilder\main.cpp">
+<ClCompile Include="src\IndexBuilder\ThreadPool.cpp">
 <Filter>Source Files</Filter>
 </ClCompile>
 </ItemGroup>
5 changes: 2 additions & 3 deletions AnnService/inc/Core/Common/BKTree.h
@@ -1,4 +1,4 @@
-// Copyright (c) Microsoft Corporation. All rights reserved.
+// Copyright (c) Microsoft Corporation. All rights reserved.
 // Licensed under the MIT License.

 #ifndef _SPTAG_COMMON_BKTREE_H_
@@ -366,8 +366,7 @@ namespace SPTAG
 int maxcluster = -1;
 SizeType maxCount = 0;
 for (int k = 0; k < m_iBKTKmeansK; k++) {
-void* currCenter = (void*)(args.centers + k * p_index->GetFeatureDim());
-if (args.newCounts[k] > maxCount && args.clusterDist[k] > p_index->ComputeDistance(currCenter, currCenter) + lambda*args.counts[k])
+if (args.newCounts[k] > maxCount && DistanceUtils::ComputeL2Distance((T*)p_index->GetSample(args.clusterIdx[k]), args.centers + k * p_index->GetFeatureDim(), p_index->GetFeatureDim()) > 1e-6)
 {
 maxcluster = k;
 maxCount = args.newCounts[k];
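The new condition in the hunk above selects the most populated cluster whose recorded sample (`args.clusterIdx[k]`) is more than 1e-6 away from its center. A minimal standalone sketch of that selection follows. It assumes plain `float` vectors, and `SquaredL2` and `PickClusterToSplit` are hypothetical names introduced here for illustration; they are not SPTAG functions, and whether `DistanceUtils::ComputeL2Distance` returns a squared value is not shown in this diff.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an L2 distance helper (squared form assumed).
float SquaredL2(const float* a, const float* b, std::size_t dim)
{
    float sum = 0.0f;
    for (std::size_t i = 0; i < dim; i++) {
        float d = a[i] - b[i];
        sum += d * d;
    }
    return sum;
}

// Returns the index of the largest cluster whose recorded sample differs
// from its center by more than eps, or -1 if no cluster qualifies.
int PickClusterToSplit(const std::vector<int>& newCounts,
                       const std::vector<std::vector<float>>& samples,
                       const std::vector<std::vector<float>>& centers,
                       std::size_t dim,
                       float eps = 1e-6f)
{
    assert(newCounts.size() == samples.size() && samples.size() == centers.size());
    int maxcluster = -1;
    int maxCount = 0;
    for (std::size_t k = 0; k < newCounts.size(); k++) {
        if (newCounts[k] > maxCount &&
            SquaredL2(samples[k].data(), centers[k].data(), dim) > eps)
        {
            maxcluster = static_cast<int>(k);
            maxCount = newCounts[k];
        }
    }
    return maxcluster;
}
```

In this sketch a caller still has to handle the `-1` result, which occurs when every cluster's sample coincides with its center.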
2 changes: 2 additions & 0 deletions AnnService/inc/Core/VectorIndex.h
@@ -61,6 +61,8 @@ class VectorIndex

 virtual ErrorCode DeleteIndex(ByteArray p_meta);

+virtual const void* GetSample(ByteArray p_meta);
+
 virtual ErrorCode SearchIndex(const void* p_vector, int p_neighborCount, bool p_withMeta, BasicResult* p_results) const;

 virtual std::string GetParameter(const std::string& p_param) const;
59 changes: 59 additions & 0 deletions AnnService/inc/Helper/VectorSetReader.h
@@ -0,0 +1,59 @@
+// Copyright (c) Microsoft Corporation. All rights reserved.
+// Licensed under the MIT License.
+
+#ifndef _SPTAG_HELPER_VECTORSETREADER_H_
+#define _SPTAG_HELPER_VECTORSETREADER_H_
+
+#include "inc/Core/Common.h"
+#include "inc/Core/VectorSet.h"
+#include "inc/Core/MetadataSet.h"
+#include "inc/Helper/ArgumentsParser.h"
+
+#include <memory>
+
+namespace SPTAG
+{
+namespace Helper
+{
+
+class ReaderOptions : public ArgumentsParser
+{
+public:
+ReaderOptions(VectorValueType p_valueType, DimensionType p_dimension, std::string p_vectorDelimiter = "|", std::uint32_t p_threadNum = 32);
+
+~ReaderOptions();
+
+std::uint32_t m_threadNum;
+
+DimensionType m_dimension;
+
+std::string m_vectorDelimiter;
+
+SPTAG::VectorValueType m_inputValueType;
+};
+
+class VectorSetReader
+{
+public:
+VectorSetReader(std::shared_ptr<ReaderOptions> p_options);
+
+virtual ~VectorSetReader();
+
+virtual ErrorCode LoadFile(const std::string& p_filePath) = 0;
+
+virtual std::shared_ptr<VectorSet> GetVectorSet() const = 0;
+
+virtual std::shared_ptr<MetadataSet> GetMetadataSet() const = 0;
+
+static std::shared_ptr<VectorSetReader> CreateInstance(std::shared_ptr<ReaderOptions> p_options);
+
+protected:
+std::shared_ptr<ReaderOptions> m_options;
+};
+
+
+
+} // namespace Helper
+} // namespace SPTAG
+
+#endif // _SPTAG_HELPER_VECTORSETREADER_H_
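The header above declares an abstract reader that holds shared options and is obtained through a static `CreateInstance` factory. A minimal self-contained sketch of that pattern follows, using simplified stand-in types. `Options`, `Reader`, `InMemoryReader` and `LoadLine` are hypothetical names for illustration only and do not reproduce SPTAG's classes; the real `LoadFile` takes a file path and returns an `ErrorCode`.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Simplified stand-in for ReaderOptions: only the fields this sketch needs.
struct Options
{
    std::size_t dimension;
    std::string delimiter; // only the first character is used in this sketch
};

// Abstract interface, loosely mirroring the shape of VectorSetReader.
class Reader
{
public:
    explicit Reader(std::shared_ptr<Options> options) : m_options(std::move(options)) {}
    virtual ~Reader() = default;

    // Parses one line of delimited values; returns false on a dimension mismatch.
    virtual bool LoadLine(const std::string& line) = 0;
    virtual const std::vector<std::vector<float>>& GetVectors() const = 0;

    // Factory: callers depend only on the abstract type.
    static std::shared_ptr<Reader> CreateInstance(std::shared_ptr<Options> options);

protected:
    std::shared_ptr<Options> m_options;
};

class InMemoryReader : public Reader
{
public:
    using Reader::Reader;

    bool LoadLine(const std::string& line) override
    {
        assert(!m_options->delimiter.empty());
        std::vector<float> values;
        std::stringstream stream(line);
        std::string token;
        // std::stof throws on a non-numeric token; a real reader would handle that.
        while (std::getline(stream, token, m_options->delimiter[0])) {
            values.push_back(std::stof(token));
        }
        if (values.size() != m_options->dimension) return false;
        m_vectors.push_back(std::move(values));
        return true;
    }

    const std::vector<std::vector<float>>& GetVectors() const override { return m_vectors; }

private:
    std::vector<std::vector<float>> m_vectors;
};

std::shared_ptr<Reader> Reader::CreateInstance(std::shared_ptr<Options> options)
{
    return std::make_shared<InMemoryReader>(std::move(options));
}
```

Keeping the options in a `std::shared_ptr` lets a derived options type (such as the `BuilderOptions` in this commit) be passed to the reader without copying.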
@@ -1,8 +1,8 @@
 // Copyright (c) Microsoft Corporation. All rights reserved.
 // Licensed under the MIT License.

-#ifndef _SPTAG_INDEXBUILDER_VECTORSETREADERS_DEFAULTREADER_H_
-#define _SPTAG_INDEXBUILDER_VECTORSETREADERS_DEFAULTREADER_H_
+#ifndef _SPTAG_HELPER_VECTORSETREADERS_DEFAULTREADER_H_
+#define _SPTAG_HELPER_VECTORSETREADERS_DEFAULTREADER_H_

 #include "../VectorSetReader.h"
 #include "inc/Helper/Concurrent.h"
@@ -13,13 +13,13 @@

 namespace SPTAG
 {
-namespace IndexBuilder
+namespace Helper
 {

 class DefaultReader : public VectorSetReader
 {
 public:
-DefaultReader(std::shared_ptr<BuilderOptions> p_options);
+DefaultReader(std::shared_ptr<ReaderOptions> p_options);

 virtual ~DefaultReader();

@@ -44,7 +44,7 @@ class DefaultReader : public VectorSetReader
 template<typename DataType>
 bool TranslateVector(char* p_str, DataType* p_vector)
 {
-std::uint32_t eleCount = 0;
+DimensionType eleCount = 0;
 char* next = p_str;
 while ((*next) != '\0')
 {
@@ -85,11 +85,11 @@ class DefaultReader : public VectorSetReader

 std::size_t m_subTaskBlocksize;

-std::atomic<std::uint32_t> m_totalRecordCount;
+std::atomic<SizeType> m_totalRecordCount;

 std::atomic<std::size_t> m_totalRecordVectorBytes;

-std::vector<std::uint32_t> m_subTaskRecordCount;
+std::vector<SizeType> m_subTaskRecordCount;

 std::string m_vectorOutput;

@@ -102,7 +102,7 @@ class DefaultReader : public VectorSetReader



-} // namespace IndexBuilder
+} // namespace Helper
 } // namespace SPTAG

-#endif // _SPTAG_INDEXBUILDER_VECTORSETREADERS_DEFAULT_H_
+#endif // _SPTAG_HELPER_VECTORSETREADERS_DEFAULT_H_
12 changes: 2 additions & 10 deletions AnnService/inc/IndexBuilder/Options.h
@@ -5,7 +5,7 @@
 #define _SPTAG_INDEXBUILDER_OPTIONS_H_

 #include "inc/Core/Common.h"
-#include "inc/Helper/ArgumentsParser.h"
+#include "inc/Helper/VectorsetReader.h"

 #include <string>
 #include <vector>
@@ -16,21 +16,13 @@ namespace SPTAG
 namespace IndexBuilder
 {

-class BuilderOptions : public Helper::ArgumentsParser
+class BuilderOptions : public Helper::ReaderOptions
 {
 public:
 BuilderOptions();

 ~BuilderOptions();

-std::uint32_t m_threadNum;
-
-std::uint32_t m_dimension;
-
-std::string m_vectorDelimiter;
-
-SPTAG::VectorValueType m_inputValueType;
-
 std::string m_inputFiles;

 std::string m_outputFolder;
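With this change `BuilderOptions` inherits the reader fields from `ReaderOptions` instead of declaring them itself. Because the base constructor requires a value type and a dimension, a default-constructible derived class has to forward some initial values. The sketch below illustrates that with hypothetical stand-in types; `ValueType`, `ReaderOptionsSketch` and `BuilderOptionsSketch` are not SPTAG names, `std::int32_t` for the dimension is an assumption, and the values forwarded to the base are placeholders rather than the ones this commit uses.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <utility>

// Hypothetical stand-in for SPTAG::VectorValueType.
enum class ValueType { Float, UInt8 };

struct ReaderOptionsSketch
{
    ReaderOptionsSketch(ValueType valueType, std::int32_t dimension,
                        std::string delimiter = "|", std::uint32_t threadNum = 32)
        : m_threadNum(threadNum), m_dimension(dimension),
          m_vectorDelimiter(std::move(delimiter)), m_inputValueType(valueType)
    {
        assert(threadNum > 0);
    }

    std::uint32_t m_threadNum;
    std::int32_t m_dimension;
    std::string m_vectorDelimiter;
    ValueType m_inputValueType;
};

struct BuilderOptionsSketch : ReaderOptionsSketch
{
    // Placeholder initial values; real values would come from parsed arguments.
    BuilderOptionsSketch() : ReaderOptionsSketch(ValueType::Float, 0) {}

    // Builder-only fields stay in the derived class.
    std::string m_inputFiles;
    std::string m_outputFolder;
};
```

Code that only needs the reader fields can then accept a reference or pointer to the base type.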
43 changes: 0 additions & 43 deletions AnnService/inc/IndexBuilder/VectorSetReader.h

This file was deleted.

11 changes: 11 additions & 0 deletions AnnService/src/Core/VectorIndex.cpp
@@ -315,6 +315,17 @@ VectorIndex::DeleteIndex(ByteArray p_meta) {
 }


+const void* VectorIndex::GetSample(ByteArray p_meta)
+{
+if (m_pMetaToVec == nullptr) return nullptr;
+
+std::string meta((char*)p_meta.Data(), p_meta.Length());
+auto iter = m_pMetaToVec->find(meta);
+if (iter != m_pMetaToVec->end()) return GetSample(iter->second);
+return nullptr;
+}
+
+
 std::shared_ptr<VectorIndex>
 VectorIndex::CreateInstance(IndexAlgoType p_algo, VectorValueType p_valuetype)
 {
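The new `GetSample(ByteArray)` overload resolves a metadata key to a vector ID and then defers to the ID-based `GetSample`, returning `nullptr` when no mapping exists or the key is unknown. A sketch of that two-step lookup follows. It assumes a `std::unordered_map<std::string, int>` for the mapping and flat `float` storage, and `TinyIndex` is a hypothetical class for illustration; the actual type of `m_pMetaToVec` and the index's storage are not shown in this diff.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <unordered_map>
#include <vector>

class TinyIndex
{
public:
    TinyIndex(std::size_t dim, bool withMetaMapping) : m_dim(dim)
    {
        assert(dim > 0);
        // The mapping is optional, as in the diff where it may be nullptr.
        if (withMetaMapping) m_metaToVec.reset(new std::unordered_map<std::string, int>());
    }

    // Appends a vector and, if the mapping is enabled, records its metadata key.
    int Add(const std::vector<float>& vec, const std::string& meta)
    {
        assert(vec.size() == m_dim);
        int id = static_cast<int>(m_data.size() / m_dim);
        m_data.insert(m_data.end(), vec.begin(), vec.end());
        if (m_metaToVec != nullptr) (*m_metaToVec)[meta] = id;
        return id;
    }

    // ID-based lookup; the pointer is invalidated by later Add calls.
    const float* GetSample(int id) const
    {
        if (id < 0 || static_cast<std::size_t>(id) >= m_data.size() / m_dim) return nullptr;
        return m_data.data() + static_cast<std::size_t>(id) * m_dim;
    }

    // Metadata-based lookup: key -> ID -> sample.
    const float* GetSample(const std::string& meta) const
    {
        if (m_metaToVec == nullptr) return nullptr;
        auto iter = m_metaToVec->find(meta);
        if (iter != m_metaToVec->end()) return GetSample(iter->second);
        return nullptr;
    }

private:
    std::size_t m_dim;
    std::vector<float> m_data;
    std::unique_ptr<std::unordered_map<std::string, int>> m_metaToVec;
};
```

Callers have to check for `nullptr`, since it covers both a disabled mapping and an unknown key.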