Skip to content

Make CacheTagArray great again (#221, #225) #257

Merged
merged 21 commits into from Dec 6, 2017

Conversation

denislos
Copy link
Contributor

@denislos denislos commented Dec 3, 2017

I have created BPUCache.h and LRUTagCache.h to implement BPU cache. I have also created some unit tests for it. Actually, there are still tests (read_no_touch test, write test, etc) to be created because I have added virtually only those ones which are tailored to the LRUTagCache.

@denislos denislos closed this Dec 3, 2017
@denislos denislos reopened this Dec 3, 2017
@denislos denislos closed this Dec 3, 2017
@denislos denislos reopened this Dec 3, 2017
@pavelkryukov
Copy link
Member

Why did you remove CacheTagArray?? We need that class for dozens of caches we will implement later.

@denislos
Copy link
Contributor Author

denislos commented Dec 3, 2017

I made a mistake when I renamed it. However, I think it's the way CacheTagArray should look like when hash tables are used. I dropped "bits and bytes" things because I thought it would be better to work with entries in the cache. I haven't changed the logic.

So this is how BPUCache ( my CacheTagArray) works. In my opinion, It's also the way original CacheTagArray works.
Each set is an array of ways. We count sets from 0 to number_of_sets - 1 and we also count ways from 0 to number_of_ways - 1. When a new element is about to be added, its set is evaluated in dependence on the address we pass as an argument to write method, but its way is the way we decide to use in accordance with the LRU algorithm which is implemented independently for each set.

@@ -162,7 +146,7 @@ class BPFactory {
std::exit( EXIT_FAILURE);
}

return map.at( name)->create( size_in_entries, ways, branch_ip_size_in_bits);
return map.at( name)->create( size_in_entries, ways);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How should we configure # of bits for IP now?

@pavelkryukov
Copy link
Member

I dropped "bits and bytes" things

What do you mean by "bits and bytes"?

I thought

Please, contact me first before making "revolutionary" changes. For the most of cases the things are made here due to a kind of "strategic" plans. I have some experience in writing simulators and I know already that something would be wrong ideas. Cache-which-come-together-with-data is one of them.

it would be better to work with entries in the cache

For many purposes we do not need to model entries in the cache, we may have only tags (for example, data cache — the data is actually taken

Each set is an array of ways. We count sets from 0 to number_of_sets - 1 and we also count ways from 0 to number_of_ways - 1. When a new element is about to be added, its set is evaluated in dependence on the address we pass as an argument to write method, but its way is the way we decide to use in accordance with the LRU algorithm which is implemented independently for each set.

Right, that is how cache works.

/* Same as previous for full-associative cache. */
for ( auto cache_size : cache_sizes)
{
CacheTagArray cta( 1024 * cache_size, 1024 * cache_size / 4);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That was deleted with nothing in return.

Copy link
Contributor Author

@denislos denislos Dec 3, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have said in the pull request description that there are some tests to be created so miss_rate_sim converted to GTest is one of them. I have planned to use it as the main and the only test for key methods like write and read_no_touch. However, I thought it would be enough to have bpu tests and LRUTagCache tests in the initial version.

@denislos denislos changed the title Make BPUCache great again (#221, #225, #77) Make CacheTagArray great again (#221, #225, #77) Dec 3, 2017
@denislos
Copy link
Contributor Author

denislos commented Dec 3, 2017

By the "bits and bytes" things I mean arguments like cache_size_in_bits and addr_size_in_bits which are not needed if the size in entries is used. I have removed branch_ip_size_in_bits for the same reason. What do you mean by cache-which-comes-together-with-data?
Sorry for not asking you first, I thought it would much better if we had more abstract CacheTagArray class, however, I still don't get why the other approach is better in our situation because we do not use explicitly any of them now (e.g. branch_ip_size_in_bits) ( we use default parameters). If we are not ok with default cell structure of CacheTagArray in the future and do not benefit from its simplicity, we still will be able to convert every structure to it by ourselves. Sorry for this.

@@ -43,42 +43,33 @@ template<typename T>
class BP final: public BaseBP
{
std::vector<std::vector<T>> data;
CacheTagArray tags;
CacheTagArray cache;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then it should be 'tags'

{ }
BP( uint32 size_in_entries, uint32 ways)
: data( ways, std::vector<T>( size_in_entries / ways))
, cache( size_in_entries, ways)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please restore 'branch_ip_size_in_bits'

// we're reusing existing CacheTagArray functionality,
// but here we don't split memory in blocks, storing
// IP's only, so hardcoding here the granularity of 4 bytes:
4,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where did that comment go?

// we're reusing existing CacheTagArray functionality,
// but here we don't split memory in blocks, storing
// IP's only, so hardcoding here the granularity of 4 bytes:
4,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happened to "4"? That line size is used for BPU, but not for data caches.

std::unordered_map<Key, typename std::list<Key>::const_iterator> lru_hash{};

size_type number_of_elements = 0u;
size_type CAPACITY;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any use case to set "size_type" to anything other than "size_t"?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks better for me not to convert size_t to uint32 and introduce size_type.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think uint32 should be used, as we will never have a cache with # of sets or ways exceeding the 32 bit.


std::pair<bool, uint32> CacheTagArray::read( Addr addr)
std::pair<Addr, uint32> CacheTagArray::read( Addr addr)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does Addr mean here? There are only two results possible: cache hit (true) and cache miss (false).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, It's definitely a mistake

@pavelkryukov
Copy link
Member

By the "bits and bytes" things I mean arguments like cache_size_in_bits and addr_size_in_bits which are not needed if the size in entries is used.

They are still needed as real HW does aliasing to reduce HW cost. For example, BP may use only 16 LSB of IP. Sometimes it makes a collision if there are two instructions with same 16 LSB, but it happens rarely while much HW is reduced.

@pavelkryukov
Copy link
Member

we do not use explicitly any of them now

It is not the reason to remove anything. There are a lot of stubs which may be unused at the moment, but they are required on the next iteration of development. If I have to remove something, I’ll remove it by myself or create a new task for student. If you think that something should be removed, please consult me/code owner first, as there might be a reason to keep it.

@pavelkryukov
Copy link
Member

it would much better if we had more abstract CacheTagArray

We had it as much abstract as possible, excepting the LRU policy. By adding entries to the CacheTagArray, it becomes a Cache class which has less flexibility (you have to handle the data)

@pavelkryukov
Copy link
Member

What do you mean by cache-which-comes-together-with-data?

To keep entries (BP entries, for instance) and cache tags in the same class. As I explained, for many cases we can have cache tags only.

uint32 size_of_line = 4,
uint32 addr_size_in_bits = 32)
: Log( false)
, number_of_sets( check_arguments( size_in_bytes,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please explain the reason why you removed CacheTagArrayCheck class?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need the separate class just to check arguments?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not? C++ guarantees that ctor of CacheTagArrayCheck class is executed before anything else, so if arguments are ill-formed, we won't go inside the function and get any undefined behavior. Additionally, if an alternative implementation of CacheTagArray existed (some unusual cache), programmer would inherit from CacheTagArrayCheck and explicitly get all the checks. Instead, your function "check_arguments" may be easily dropped by accident, plus it has very unclear return value (I would expect void or Boolean true if everything correct at least).

std::pair<bool, uint32> read_no_touch( Addr addr) const;
/* create new entry in cache */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Deleting valid comments is a sabotage.

#include <infra/types.h>
#include <infra/log.h>

/* Replacement algorithm modules (LRU). */
class LRUInfo
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LRU was intentionally put to the separate class as there are a lot of alternative policies (optimal replacement, pseudo-LRU, MRU, weighted-LRU etc.). So, please put the separate class back, later we will introduce the polymorphic replacement policy selection.

@@ -1,82 +1,58 @@
/**
* cache_tag_array.h
* Header for the cache tag array model.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Deleting valid comments is a sabotage.

return addr / line_size;
}
cache[ num_set].update( num_tag);
const auto&[ is_hit, value] = cache[ num_set].find( num_tag);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Find is counter-intuition. Usually when we write data to the cache, we should find only a LRU way.

std::pair<bool, uint32> result;

// try to find an element in the empty cache
GTEST_ASSERT_NO_DEATH( result = cache.find( 10););
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no point to have "GTEST_ASSERT_NO_DEATH". If something failed inside, it would crash the test anyway.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We won't be able to tell what exactly went wrong.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In both cases you cannot tell what went wrong without a debugger and/or code review. Wrapping everything into these brackets gives extra ~5% of information, but readability of the code is around zero.

Copy link
Member

@pavelkryukov pavelkryukov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's do integration step by step. Could you please resubmit review only with the test?

@denislos
Copy link
Contributor Author

denislos commented Dec 4, 2017

Sorry, I don't get it. What do you mean by resubmitting review only with the test? You want me to commit only with the unit tests, right?

@pavelkryukov
Copy link
Member

Right.

@denislos denislos reopened this Dec 5, 2017
@denislos denislos changed the title Make CacheTagArray great again (#221, #225, #77) Make CacheTagArray great again (#221, #225) Dec 5, 2017
lru.touch( set_num, way_num); // update LRU info
}
if ( result.first)
lru_module.update( num_set, num_tag); // update LRU if it's a hit
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

# of way should be used here, not a tag, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, we need a tag here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, we need not. LRU class should know nothing about tags. It has only two basic operations:

  • Put some way to the head of the list (== touch)
  • Take a way from the tail ( == update == allocate)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then there should be a kind of reflection of WAYS to TAGS( e,g, another map which is not good), because we need somehow to know the least recently used tag to erase it from the hash table.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, I will use another map.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then there should be a kind of reflection of WAYS to TAGS( e,g, another map which is not good), because we need somehow to know the least recently used tag to erase it from the hash table.

Yes, it should be used for CacheTagArray. It is a common operation for cache ("check what data is in way 1 of set 4").

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, I will use another map.

Note that for PseudoLRU there is no map inside LRU class, it uses a different mechanism. We might use FIFO replacement as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have used a map inside CacheTagArray.

uint32 write( Addr addr);

uint32 set( Addr addr) const;
Addr tag( Addr addr) const;
uint32 set( Addr addr) const { return (addr / line_size) & (number_of_sets - 1); }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Recently I've added aliasing to these functions, please return it back.

: lru_info( number_of_sets, LRUCacheInfo<Addr>( number_of_ways))
{ }

std::pair<Addr, uint32> update( uint32 num_set, Addr addr)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There should be two methods:

  • Touch (when an existing element is accessed)
  • Update (when an new element is added)

std::pair<bool, uint32> CacheTagArray::read( Addr addr)
{
const auto lookup_result = read_no_touch( addr);
const auto&[ is_hit, way_num] = lookup_result;
auto result = read_no_touch( addr);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please use C++17 syntax?

const auto&[ is_hit, way_num] = lookup_result;

for ( auto it = list.begin(); it != list.end(); ++it)
std::size_t return_value = number_of_elements;

if ( number_of_elements == CAPACITY)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you may initialize the hash table in ctor and remove number_of_elements

const Addr addr_mask;

// maps to convert num_ways to tags
std::vector<std::unordered_map<uint32, Addr>> ways_to_tags;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It can be a std::vector<std::vector<Addr>>, right?

size_of_line,
addr_size_in_bits)
, number_of_sets( size_in_bytes / ( ways * size_of_line))
, number_of_ways( ways)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why were these fields moved from CacheTagArrayCheck?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need them there? I mean, It's strange to have them in the "Check" class.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Denis, that strange kind of discussion is repeating over and over. You are the person who did the changes, so I should be the guy who expects the answers which explain your motivation.

We need it here because we may have it here. "CacheTagArrayCheck" has the highest level of abstraction, so if a class was inherited from it we would not think about correctness of size variables anymore. We may add more methods to the base class which operate with sizes only and then use them for some constant expressions or run-time checks etc. I agree that "Check" may be not the best name, but "CacheTagArraySizeBaseWithBoundaryChecks" has some drawbacks either.

There are a lot of things in C++ programs which may look strange, but they can be even the only one possible solution originated from days and months of the code evolution. If they don't contain a bug, it is better not to touch them.

if ( line_size == 0)
serr << "ERROR: Wrong arguments! Line size should be greater than zero"

if ( size_of_line == 0)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"line size" is a common idiom for caches, please move it back.

const Addr addr_mask;

// to convert num_ways to tags
std::vector<std::vector<std::pair<bool, Addr>>> ways_to_tags;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

std::pair is not the best idea.

  1. It is usually unclear what "first" and "second" mean
  2. One day we will need to add one more field to the table. With pair/tuple we shall re-write everything!!

@pavelkryukov pavelkryukov merged commit 97760b9 into MIPT-ILab:master Dec 6, 2017
@denislos denislos deleted the lru branch February 3, 2018 17:42
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants